# Mean time to recovery (MTTR)

## What is mean time to recovery?

Mean time to recovery (MTTR) is the average time a system takes to recover from an outage.

## How do you calculate mean time to recovery?

Mean time to recovery is calculated by adding up the downtime duration of every incident in a given period and dividing that total by the number of incidents. Some maintenance personnel multiply this number by 100 to get their MTTR as a percentage. The formula to calculate mean time to recovery is:

$$
\text{MTTR} = \frac{\text{Downtime duration}}{\text{Number of incidents}} \times 100
$$
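As a rough illustration of the calculation described above, the sketch below averages the downtime of a few incidents. The function name `mean_time_to_recovery` and the sample durations are assumptions made for the example, and the optional multiplication by 100 is left out so the result stays in minutes.

```python
def mean_time_to_recovery(downtimes_minutes):
    """Return the average downtime per incident, in the same unit as the inputs."""
    if not downtimes_minutes:
        raise ValueError("at least one incident is required")
    return sum(downtimes_minutes) / len(downtimes_minutes)


# Hypothetical downtime durations, in minutes, for three incidents.
downtimes = [30, 45, 15]
mttr = mean_time_to_recovery(downtimes)
print(mttr)  # 30.0
```

Here the three incidents add up to 90 minutes of downtime, so the average per incident is 30 minutes.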

## Is a higher mean time to recovery better?

No. Mean time to recovery tells you how quickly an organization can get back to normal after it experiences an outage, so this number should be as low as possible. A high MTTR indicates that your company has trouble recovering from incidents quickly and efficiently and may have difficulty meeting expectations when something goes wrong.

For example, if an asset has a five-minute MTTR and an incident causes a service disruption, you can expect it to take about five minutes, on average, for the asset and the maintenance personnel who depend on it to recover and resume normal operations.

## What affects mean time to recovery?

While MTTR is helpful to track over time, it's important to remember that it's affected by many factors outside of a maintenance team's control, such as the availability of resources during an outage or whether the cause of the problem was human error or a critical failure in an asset.

For example, if no resources are available during an outage and you have to wait for someone else to fix the problem before proceeding with your work, that will affect MTTR. Similarly, the cause of the problem influences how quickly things get back up and running: an outage caused by human error (e.g., accidentally pulling out a cable in a machine) will follow a different recovery path than one caused by a critical failure in infrastructure (e.g., a power outage).

## What's the difference between mean time to repair, restore and recover?

Although mean time to repair, mean time to restore and mean time to recovery all use the same acronym (MTTR), they measure different things.

• Mean time to repair (MTTR): Typically refers to the average time it takes to repair or fix a failed component or piece of equipment.
• Mean time to restore (MTTR): Specifically measures the average time it takes to restore a system or service to its normal operational state after a disruption or incident. It encompasses the time required both to fix the underlying issue and to verify that the system or service is fully restored and functioning correctly.
• Mean time to recovery (MTTR): Refers to the average time it takes to recover a system or service after an incident or failure occurs. It measures the time from when the incident is detected until the system or service is fully functional. Mean time to recovery helps maintenance teams assess the speed and efficiency of their incident response and resolution processes.
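To make the recovery definition concrete, here is a minimal sketch that measures each incident from the moment it is detected until the service is fully functional again, then averages the results. The timestamps, the `recovery_minutes` helper and the shape of the incident log are hypothetical, chosen only for illustration.

```python
from datetime import datetime


def recovery_minutes(detected_at, recovered_at):
    """Minutes from detection until the service is fully functional again."""
    return (recovered_at - detected_at).total_seconds() / 60


# Hypothetical incident log: (detected, fully recovered) timestamp pairs.
incidents = [
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 9, 40)),
    (datetime(2024, 3, 8, 14, 15), datetime(2024, 3, 8, 14, 35)),
]

durations = [recovery_minutes(detected, recovered) for detected, recovered in incidents]
mttr = sum(durations) / len(durations)
print(durations)  # [40.0, 20.0]
print(mttr)       # 30.0
```

Measuring mean time to repair instead would mean swapping in different start and end timestamps, such as when the repair work itself began and ended.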

## Maintenance teams should track how quickly they recover from outages and how often outages happen

Mean time to recovery is a valuable metric to track over time and can help you identify trends in your recovery times. But it's important to remember that MTTR is affected by many factors outside of maintenance teams' control, such as the availability of resources during an outage or whether the cause of the problem was human error or a critical failure in infrastructure.
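One possible way to watch both recovery speed and outage frequency over time is to group incidents by month and report the incident count alongside the MTTR for each month. The records, the month labels and the `mttr_by_month` function below are assumptions made for this sketch, not data from a real system.

```python
from collections import defaultdict

# Hypothetical records: (month, downtime in minutes) for each incident.
records = [
    ("2024-01", 60),
    ("2024-01", 40),
    ("2024-02", 30),
    ("2024-02", 50),
    ("2024-03", 20),
]


def mttr_by_month(records):
    """Map each month to (number of incidents, MTTR in minutes)."""
    by_month = defaultdict(list)
    for month, minutes in records:
        by_month[month].append(minutes)
    return {
        month: (len(values), sum(values) / len(values))
        for month, values in sorted(by_month.items())
    }


trend = mttr_by_month(records)
for month, (count, mttr) in trend.items():
    print(month, count, mttr)
# 2024-01 2 50.0
# 2024-02 2 40.0
# 2024-03 1 20.0
```

In this made-up data the MTTR falls from 50 to 20 minutes over three months, which is the kind of trend the metric is meant to surface.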