The Four Golden Signals

The four key metrics that represent the health of a system: Latency, Traffic, Errors, and Saturation.

By Niketa Sharma, Founder at Runframe · Last updated Mar 2026

"The Vital Signs"

Just as a doctor checks heart rate and blood pressure, an SRE checks the Four Golden Signals.

1. Latency

The time it takes to service a request.

  • Tip: Track the latency of successful requests separately from the latency of failed ones. Errors can return very quickly or very slowly, and mixing them in distorts the picture of normal performance.
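
As a rough illustration of the tip above, the sketch below groups request durations by outcome before summarizing them. The `(status, duration_ms)` tuple format, the function name, and the sample values are assumptions made for this example, not part of any particular monitoring tool.

```python
from collections import defaultdict
from statistics import median


def latency_by_outcome(requests):
    """Return the median latency (ms) for successful and failed requests.

    `requests` is an iterable of (http_status, duration_ms) pairs.
    Treating status >= 500 as an error is an assumption for this sketch.
    """
    groups = defaultdict(list)
    for status, duration_ms in requests:
        outcome = "error" if status >= 500 else "success"
        groups[outcome].append(duration_ms)
    return {outcome: median(values) for outcome, values in groups.items()}


# Hypothetical sample: errors are either very fast or very slow.
sample = [(200, 90), (200, 110), (200, 100), (500, 5), (500, 3000), (503, 7)]
summary = latency_by_outcome(sample)
print(summary)
```

Keeping the two groups apart means a burst of fast-failing requests cannot make the service look healthier than it is.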

2. Traffic

A measure of how much demand is being placed on your system.

  • Web: Requests per second (RPS).
  • Audio: Concurrent streams.
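
For a web service, requests per second can be derived from request timestamps. The following is a minimal sketch, assuming timestamps are seconds as floats and that the caller supplies the current time; the function name and the sample data are hypothetical.

```python
def requests_per_second(timestamps, now, window_seconds):
    """Average request rate over the trailing window ending at `now`."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    # Count requests in the half-open interval (now - window, now].
    recent = [t for t in timestamps if now - window_seconds < t <= now]
    return len(recent) / window_seconds


# Hypothetical arrival times, in seconds.
arrivals = [0.0, 0.5, 1.2, 9.1, 9.5, 9.9, 10.0]
rps = requests_per_second(arrivals, now=10.0, window_seconds=2.0)
print(rps)
```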

3. Errors

The rate of requests that fail.

  • Explicit: HTTP 500s.
  • Implicit: HTTP 200s with "Success: False" body (content errors).
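
Counting both kinds of failure could look like the sketch below. It assumes responses arrive as `(status, body)` pairs and that implicit errors are flagged by a JSON field named `success`; that field name is an assumption based on the example above, so adapt it to whatever your API actually returns.

```python
import json


def is_failed(status, body):
    """True for explicit errors (5xx) and implicit ones (200 with a failure body)."""
    if status >= 500:
        return True
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Body is not JSON, so there is no failure flag to inspect.
        return False
    return isinstance(payload, dict) and payload.get("success") is False


def error_rate(responses):
    """Fraction of responses that failed, explicitly or implicitly."""
    if not responses:
        return 0.0
    failed = sum(1 for status, body in responses if is_failed(status, body))
    return failed / len(responses)


# Hypothetical sample: one implicit error, one explicit error, two successes.
responses = [
    (200, '{"success": true}'),
    (200, '{"success": false}'),
    (500, ""),
    (200, "OK"),
]
rate = error_rate(responses)
print(rate)
```

A monitor that counts only status codes would report half the failures in this sample.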

4. Saturation

How "full" your service is.

  • CPU usage, Memory, Disk I/O.
  • As saturation approaches 100%, performance degrades rapidly (latency spikes). Many systems start to degrade well before they reach full utilization.
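
To see why latency climbs so sharply near full utilization, one simplified model is the single-server M/M/1 queue, where mean response time equals service time divided by (1 - utilization). This model is an assumption chosen for illustration; real services rarely match it exactly, and the 10 ms service time is a made-up figure.

```python
def mean_response_time_ms(service_time_ms, utilization):
    """Mean response time in an M/M/1 queue: service_time / (1 - utilization)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_ms / (1 - utilization)


# Hypothetical 10 ms service time at increasing utilization levels.
curve = {u: mean_response_time_ms(10.0, u) for u in (0.5, 0.9, 0.99)}
for utilization, latency_ms in curve.items():
    # Roughly 20 ms, 100 ms, and 1000 ms respectively.
    print(f"{utilization:.0%} utilized -> {latency_ms:.0f} ms")
```

Under this model, going from 90% to 99% utilization multiplies latency by about ten, which is why saturation alerts are usually set well below 100%.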

Example: The Slow Disk

A service was slow, but CPU and Memory were low. No errors were firing.

Impact: Latency increased from 100 ms to 2 s.

Resolution: The team checked "Saturation" and found Disk I/O was at 100%. A logging process was spamming the disk. They throttled the logger, and latency recovered.

Why the Four Golden Signals Matter

Popularized by Google's Site Reliability Engineering book, these signals give you a high-level view of any system's health.

Monitoring these four signals is often enough to detect most user-facing incidents.

Common Pitfalls

Averages
Don't rely on average latency. Monitor a high percentile such as p99 latency instead. Averages hide outliers, so a small share of very slow requests can go unnoticed.
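
The gap between the average and p99 is easy to demonstrate. The sketch below uses a simple nearest-rank percentile and a made-up sample in which 98 requests take 100 ms and 2 take 2000 ms; the helper and the data are illustrative assumptions, not a recommendation of a specific percentile method.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical sample: 2% of requests are twenty times slower than the rest.
latencies_ms = [100] * 98 + [2000] * 2

avg = mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"avg={avg} ms, p50={p50} ms, p99={p99} ms")
```

Here the average stays close to the typical request, while p99 exposes the 2-second experience that some users actually get.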

How to Use the Four Golden Signals

  • Latency: Time it takes to service a request.
  • Traffic: Demand on your system (e.g., RPS).
  • Errors: Rate of failed requests.
  • Saturation: How "full" your service is (e.g., CPU, memory, disk I/O).
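
One way to put the signals to work is to collect them into a single snapshot and compare each against a threshold. The sketch below is a minimal example; the `GoldenSignals` class, the threshold values, and the traffic figure are all hypothetical, and the latency and saturation numbers are borrowed from the slow-disk example.

```python
from dataclasses import dataclass


@dataclass
class GoldenSignals:
    p99_latency_ms: float
    requests_per_second: float
    error_rate: float  # fraction of failed requests, 0.0 to 1.0
    saturation: float  # utilization of the most constrained resource, 0.0 to 1.0


# Hypothetical thresholds; tune these per service.
# Traffic has no limit here because it mainly gives context to the other signals.
THRESHOLDS = {"p99_latency_ms": 500.0, "error_rate": 0.01, "saturation": 0.8}


def breached(signals):
    """Return the names of the signals that exceed their thresholds."""
    return [name for name, limit in THRESHOLDS.items() if getattr(signals, name) > limit]


# Snapshot resembling the slow-disk incident: no errors, but the disk is full.
slow_disk = GoldenSignals(
    p99_latency_ms=2000.0,
    requests_per_second=120.0,
    error_rate=0.0,
    saturation=1.0,
)
alerts = breached(slow_disk)
print(alerts)
```

In this snapshot the error signal stays quiet, so only the latency and saturation checks point to the problem.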
