vkrnt

Technology. Leadership.

SRE / SLI, SLO, SLA

What are SLI, SLO, SLA? Where are they used?

SLI, SLO, SLA
Service Level Indicators are core metrics used to measure a service. Latency is the most common SLI - how long it takes for a service to respond. Check out SLI definition in the SRE book from Google. Other metrics used as SLIs are availability, error rate and throughput. Availability is a big favorite with SREs; you will here three nines, five nines being flung around in SRE discussions all the time. The target audience for SLIs is mostly the SREs (site reliability engineers).

Service Level Objectives are thresholds set by the SRE team for internal reference. These are boxed in a pre-decided time period (monthly, quarterly). SLO definition is typically a range for a target SLI. For example, 'our Q1 SLO is to keep average latency less than 10ms (0-10ms) for the search service'.

Service Level Agreements are business facing agreements. Unlike SLOs, missing an SLA has consequences beyond a slap on the knuckles. More often than not, there is a financial impact of missing an SLA (lost sales because of slow site response).