What is Mean Time to Recovery?
Mean Time to Recovery (MTTR) is the average time it takes to restore service after an incident.
How to calculate it
Calculate Mean Time to Recovery as total downtime divided by the number of incidents in the same period: MTTR = Total downtime / Number of incidents. Pull the inputs from your connected data and track the trend over time in your dashboard.
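As a minimal sketch of the formula above, the following Python computes MTTR from a list of incident start and restore timestamps. The function name `mean_time_to_recovery` and the sample timestamps are illustrative assumptions, not part of any particular tool; the sample data is chosen to add up to 120 minutes of downtime across 4 incidents.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Return MTTR as a timedelta: total downtime / number of incidents.

    `incidents` is a list of (started, restored) datetime pairs.
    """
    if not incidents:
        raise ValueError("MTTR is undefined when there are no incidents")
    # Sum each incident's downtime, starting from a zero timedelta.
    total_downtime = sum(
        (restored - started for started, restored in incidents),
        timedelta(),
    )
    return total_downtime / len(incidents)


# Hypothetical incident log (illustrative data only).
incidents = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 20)),     # 20 min
    (datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 14, 45)),   # 45 min
    (datetime(2024, 1, 17, 2, 30), datetime(2024, 1, 17, 2, 40)),  # 10 min
    (datetime(2024, 1, 25, 11, 0), datetime(2024, 1, 25, 11, 45)), # 45 min
]

mttr = mean_time_to_recovery(incidents)
print(mttr.total_seconds() / 60)  # 30.0 minutes
```

Returning a `timedelta` rather than a bare number keeps the unit explicit, so the caller decides whether to report minutes or hours.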
Examples
Example 1
120 minutes total downtime across 4 incidents -> 30-minute MTTR.
Example 2
120 minutes of total downtime across 4 incidents -> 30-minute MTTR. That is within the elite range of under one hour, a result that typically reflects good alerting and runbooks.
Why it matters
Mean time to recovery (MTTR) is the average time to restore service after an incident and is a DORA measure of resilience. Fast recovery limits the customer and revenue impact of inevitable failures. Excluding detection time understates true MTTR and can mask slow alerting.
Benchmark context
Elite teams recover in under one hour; longer recovery times point to gaps in monitoring, runbooks or on-call processes.
Common pitfalls
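Excluding detection time. Starting the clock when the incident is detected rather than when the failure begins understates true MTTR and can mask slow alerting.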
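To illustrate the pitfall, here is a small Python sketch that computes MTTR twice for the same incidents: once from the start of the failure and once from the moment of detection. The field names `started`, `detected` and `restored`, the helper `mean_duration`, and the sample timestamps are all assumptions made for this example.

```python
from datetime import datetime, timedelta


def mean_duration(pairs):
    """Average the (end - start) duration over a list of datetime pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("cannot average an empty list of incidents")
    return sum((end - start for start, end in pairs), timedelta()) / len(pairs)


# Hypothetical incident records (illustrative data only).
records = [
    {
        "started": datetime(2024, 2, 1, 10, 0),
        "detected": datetime(2024, 2, 1, 10, 15),
        "restored": datetime(2024, 2, 1, 10, 45),
    },
    {
        "started": datetime(2024, 2, 8, 22, 0),
        "detected": datetime(2024, 2, 8, 22, 5),
        "restored": datetime(2024, 2, 8, 22, 20),
    },
]

# Clock starts when the failure begins: includes detection time.
mttr_full = mean_duration((r["started"], r["restored"]) for r in records)

# Clock starts at detection: excludes detection time.
mttr_from_detection = mean_duration((r["detected"], r["restored"]) for r in records)

# The gap between the two is the average time spent undetected.
hidden_detection_time = mttr_full - mttr_from_detection

print(mttr_full.total_seconds() / 60)              # 32.5
print(mttr_from_detection.total_seconds() / 60)    # 22.5
print(hidden_detection_time.total_seconds() / 60)  # 10.0
```

In this made-up data, measuring from detection hides an average of 10 minutes per incident, which is the kind of gap that can conceal slow alerting.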