vkrnt

Technology. Leadership.

How is availability defined? What are 3 nines, 4 nines, 5 nines of availability?

3 nines, 4 nines, 5 nines of availability
In an SRE environment, availability is an important SLI (service level indicator). One way to represent availability is as percentage uptime, condensed as n-nines. For example, 3 nines of availability means the service was up and running 99.9% of the time. A common mistake when defining availability SLOs (service level objectives) is to judge whether a service is 'available' from an Ops or Infra perspective rather than the customer's perspective. If a customer is not able to perform a transaction because a service is failing, that service should be deemed 'not available', even if its hosts and processes look healthy. Basically, availability should be measured from a customer's standpoint, not an Ops engineer's.
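As a rough sketch of the customer-centric view, availability can be computed as the share of customer requests that succeeded, rather than from host or process uptime. The `Request` record, its `succeeded` field, and the `meets_slo` helper are hypothetical names chosen for illustration. How "succeeded" is decided (status code, latency threshold, correct payload) is an assumption left to the service owner.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One customer-facing request (hypothetical record shape)."""
    succeeded: bool  # True only if the customer could complete the transaction


def availability(requests: list[Request]) -> float:
    """Return the percentage of customer requests that succeeded."""
    if not requests:
        raise ValueError("no requests observed in the measurement window")
    good = sum(1 for r in requests if r.succeeded)
    return 100.0 * good / len(requests)


def meets_slo(requests: list[Request], target_pct: float) -> bool:
    """Check the measured availability against an SLO target such as 99.9."""
    return availability(requests) >= target_pct


if __name__ == "__main__":
    # 999 successful requests and 1 failure in the window.
    window = [Request(True)] * 999 + [Request(False)]
    print(f"availability: {availability(window):.3f}%")
    print("meets 99.99% SLO:", meets_slo(window, 99.99))
```

Counting requests instead of minutes of uptime means a service that is "up" but failing every transaction scores as unavailable, which is the point of measuring from the customer's side.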

Here are some common figures thrown around as target SLOs, and what they translate to as downtime budgets.

| Availability | Annual downtime budget | Quarterly downtime budget | Monthly downtime budget | Weekly downtime budget |
|---|---|---|---|---|
| One nine or 90% | 36.5 days | 9.1 days | 72 hours | 16.8 hours |
| Two nines or 99% | 3.65 days | 21.9 hours | 7.2 hours | 1.68 hours |
| Three nines or 99.9% | 8.76 hours | 2.2 hours | 43.2 minutes | 10.1 minutes |
| Four nines or 99.99% | 52.56 minutes | 13.14 minutes | 4.32 minutes | 1.01 minutes |
| Five nines or 99.999% | 5.26 minutes | 1.3 minutes | 25.9 seconds | 6.05 seconds |

3 nines is where we start getting serious. Figures assume a 365-day year and a 30-day month.
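The arithmetic behind these budgets is simply the window length multiplied by the allowed failure fraction (1 minus availability). Here is a minimal sketch, assuming a 365-day year, a quarter of one fourth of a year, and a 30-day month. The month length is an assumption: using 365/12 days instead gives slightly larger monthly figures, for example 43.8 minutes at three nines. The names `WINDOWS` and `downtime_budget` are illustrative only.

```python
from datetime import timedelta

# Measurement windows. The 30-day month is an assumption (see note above).
WINDOWS = {
    "annual": timedelta(days=365),
    "quarterly": timedelta(days=365) / 4,
    "monthly": timedelta(days=30),
    "weekly": timedelta(days=7),
}


def downtime_budget(availability_pct: float, window: timedelta) -> timedelta:
    """Allowed downtime within `window` for a target such as 99.9 (percent)."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    # Budget = window length x allowed failure fraction.
    return window * (1 - availability_pct / 100)


if __name__ == "__main__":
    for target in (90.0, 99.0, 99.9, 99.99, 99.999):
        budgets = ", ".join(
            f"{name}: {downtime_budget(target, window)}"
            for name, window in WINDOWS.items()
        )
        print(f"{target}% -> {budgets}")
```

Each extra nine divides the budget by ten, which is why the jump from three to four nines moves the annual allowance from hours to minutes.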