INCIDENT RESPONSE

Incident Triage

The initial phase of incident response where the severity, impact, and required expertise are determined.

By Niketa Sharma, Founder at Runframe · Last updated Mar 2026

The Art of Prioritization

Triage is a medical term adapted for DevOps. In an emergency room, triage nurses decide who needs a surgeon immediately and who can wait. In incident management, triage determines whether an alert is a "drop everything" incident (SEV1) or a "fix it next week" issue (SEV3).

The Triage Checklist

  1. Verify: Is this actually broken? Eliminate false positives first.
  2. Assess Impact: Who is affected? All users, or just internal admins?
  3. Assign Severity: Map the impact to a SEV level (SEV1, SEV2, etc.).
  4. Route: Page the team that owns the affected service.
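
The checklist above can be sketched as a small function. This is a minimal illustration, not a reference implementation: the `Alert` fields, the scope names, the scope-to-severity mapping, and the on-call team names are all hypothetical and would come from your own alerting and paging setup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    service: str
    confirmed: bool       # result of step 1: was the failure reproduced or confirmed?
    affected_scope: str   # "all_users", "some_users", or "internal"


@dataclass
class TriageResult:
    severity: str
    page: str


# Hypothetical mapping from impact scope to severity level.
SEVERITY_BY_SCOPE = {"all_users": "SEV1", "some_users": "SEV2", "internal": "SEV3"}

# Hypothetical service ownership table used for routing.
ONCALL_BY_SERVICE = {"checkout": "payments-oncall", "search": "search-oncall"}


def triage(alert: Alert) -> Optional[TriageResult]:
    # 1. Verify: drop false positives before anyone is paged.
    if not alert.confirmed:
        return None
    # 2 and 3. Assess impact and map it to a severity level.
    severity = SEVERITY_BY_SCOPE.get(alert.affected_scope, "SEV3")
    # 4. Route: page the owning team, falling back to a default rotation.
    team = ONCALL_BY_SERVICE.get(alert.service, "platform-oncall")
    return TriageResult(severity=severity, page=team)
```

Returning `None` for unconfirmed alerts keeps the "verify" step explicit: nothing is classified or routed until the failure is known to be real.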

Example: E-commerce Platform Triage

“During a flash sale, checkout fails. Triage confirms the impact: "All users, revenue impact = $10K/min". The incident is immediately classified as SEV0, three engineers are paged, and a workaround is implemented in 8 minutes.”

Impact: Saved $80K in potential revenue loss
Resolution: Root cause was database connection pool exhaustion

Why Triage Matters

Prevents high-severity incidents from being ignored.

Ensures the right people are paged, reducing noise for others.

Sets the pace for the entire incident response.

Common Pitfalls

Treating all alerts as SEV1: Use a severity matrix to classify alerts consistently by business impact, not just technical severity.
Triaging in isolation, without context: Check with service owners before escalating. A "down" database might simply be in a planned maintenance window.
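
Both pitfalls can be guarded against with a lookup table and a calendar check. The sketch below assumes a hypothetical severity matrix keyed on user scope and revenue impact, and a hypothetical in-memory maintenance calendar; the service name, time window, and the `CHECK_WITH_OWNER` label are illustrative only.

```python
from datetime import datetime, timezone

# Hypothetical severity matrix: (user scope, revenue affected?) -> severity.
SEVERITY_MATRIX = {
    ("all_users", True): "SEV1",
    ("all_users", False): "SEV2",
    ("some_users", True): "SEV2",
    ("some_users", False): "SEV3",
    ("internal", True): "SEV3",
    ("internal", False): "SEV3",
}

# Hypothetical maintenance calendar: service -> list of (start, end) in UTC.
MAINTENANCE_WINDOWS = {
    "orders-db": [
        (
            datetime(2026, 3, 1, 2, 0, tzinfo=timezone.utc),
            datetime(2026, 3, 1, 4, 0, tzinfo=timezone.utc),
        )
    ],
}


def in_maintenance(service: str, at: datetime) -> bool:
    # True if `at` falls inside any planned window for this service.
    windows = MAINTENANCE_WINDOWS.get(service, [])
    return any(start <= at < end for start, end in windows)


def classify(service: str, scope: str, revenue_affected: bool, at: datetime) -> str:
    # Context first: an outage during planned maintenance is confirmed
    # with the service owner rather than escalated straight away.
    if in_maintenance(service, at):
        return "CHECK_WITH_OWNER"
    # Otherwise classify by business impact using the matrix.
    return SEVERITY_MATRIX[(scope, revenue_affected)]
```

Because the matrix is plain data, two responders looking at the same impact reach the same severity, which is the point of having one.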

How to Use Triage

Severity Matrix: Have a clear Definition of Severity table.
Auto-Triage: Use tools to auto-label alerts based on payload.
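
As a rough illustration of payload-based auto-labeling, the sketch below applies ordered rules to a JSON alert payload. The field names (`service`, `env`, `error_rate`), the thresholds, and the labels are assumptions for the example and do not correspond to any particular tool's schema.

```python
import json

# Hypothetical rules, evaluated in order; the first match wins.
# Each rule is (predicate over the parsed payload, label to apply).
RULES = [
    # Non-production alerts are never treated as high severity.
    (lambda p: p.get("env") in ("staging", "dev"), "SEV3"),
    # Checkout failing for at least half of requests is treated as critical.
    (lambda p: p.get("service") == "checkout" and p.get("error_rate", 0) >= 0.5, "SEV1"),
    # Any other service with an elevated error rate.
    (lambda p: p.get("error_rate", 0) >= 0.1, "SEV2"),
]

# Alerts that match no rule are left for a human to triage.
DEFAULT_LABEL = "needs-human-triage"


def auto_label(raw_payload: str) -> str:
    payload = json.loads(raw_payload)
    for matches, label in RULES:
        if matches(payload):
            return label
    return DEFAULT_LABEL
```

Rule order matters here: the environment check runs first so that a noisy staging alert cannot be labeled SEV1, and anything the rules do not recognize falls through to manual triage instead of receiving a guessed severity.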
