Major Incident Management

Major Incident Definition

A Major Incident (MI) is a high-impact, urgent P1 incident that affects a significant portion of the user base or a critical business service. StackFlow automatically escalates an incident to Major Incident status when either of two configurable criteria is met: the incident is P1 and has not been acknowledged within 15 minutes, or the incident is P1 and CMDB dependency analysis detects more than 50 affected users.
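The two escalation criteria above can be expressed as a small predicate. This is a minimal sketch, not StackFlow's implementation: the `Incident` fields and the `should_auto_declare` function are hypothetical names, and it assumes an incident that has already been acknowledged no longer triggers the 15-minute rule.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

ACK_TIMEOUT = timedelta(minutes=15)  # "no acknowledgment within 15 minutes"
AFFECTED_USER_LIMIT = 50             # "more than 50 affected users"


@dataclass
class Incident:
    # Hypothetical record shape; field names are assumptions for illustration.
    priority: str
    created_at: datetime
    acknowledged_at: Optional[datetime]
    affected_users: int  # assumed to come from CMDB dependency analysis


def should_auto_declare(incident: Incident, now: datetime) -> bool:
    """Return True when a P1 incident meets either escalation criterion."""
    if incident.priority != "P1":
        return False
    # Criterion 1: still unacknowledged once 15 minutes have elapsed.
    unacknowledged = (
        incident.acknowledged_at is None
        and now - incident.created_at >= ACK_TIMEOUT
    )
    # Criterion 2: strictly more than 50 affected users.
    wide_impact = incident.affected_users > AFFECTED_USER_LIMIT
    return unacknowledged or wide_impact
```

Keeping both limits as named constants mirrors the fact that the criteria are configurable; a real deployment would read them from system properties rather than hard-coding them.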

⚙️ Minimum Requirements
  • DynamoDB: StackFlow_MajorIncident table with warRoomId attribute and GSI on status
  • SNS Topic: stackflow-major-incident-alerts with subscriptions for all stakeholder groups
  • SES: major-incidents@stackflow-tech.com verified in SES for sending major incident communications
  • Role: Declaring a major incident requires itsm_manager or super_admin JWT claim
  • Lambda: StackFlowMajorIncidentNotifier deployed with SES SendEmail permission

Auto-Declaration: The auto-declaration threshold can be adjusted under Admin → System Properties → major_incident_auto_declare_threshold. Lowering this threshold in production may cause alert fatigue, so consult your ITSM manager before changing it.
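The role requirement listed above (itsm_manager or super_admin) amounts to a check against the decoded JWT claims. The sketch below assumes the token has already been verified and decoded elsewhere, and that roles are carried in a claim named `roles`; that claim name and the `can_declare_major_incident` helper are hypothetical, since this page does not specify the token layout.

```python
from typing import Any, Mapping

# Roles named in the requirements list above.
ALLOWED_ROLES = {"itsm_manager", "super_admin"}


def can_declare_major_incident(claims: Mapping[str, Any]) -> bool:
    """Check already-verified JWT claims for a permitted role.

    Assumes a hypothetical "roles" claim holding a string or a list of strings.
    """
    roles = claims.get("roles", [])
    if isinstance(roles, str):
        roles = [roles]
    return any(role in ALLOWED_ROLES for role in roles)
```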

War Room

When a Major Incident is declared, StackFlow creates a War Room, a dedicated collaboration space within the portal that consolidates all communication, technical updates, and action items. The War Room includes a timeline of all events, a shared scratchpad for technical notes, and integration with your organization's chat platform (Slack or Teams).

curl -X POST https://your-instance.stackflow-tech.com/prod/api/major-incidents \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "incident_id": "INC0001234",
    "major_incident_manager": "usr_alice_johnson",
    "bridge_link": "https://meet.google.com/abc-defg-hij",
    "initial_impact_statement": "Production API gateway returning 503 for all users"
  }'
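
For scripted declarations, the same request can be built with Python's standard library. This is a sketch that mirrors the curl call above; `build_declare_request` is a hypothetical helper, the token value is a placeholder, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_declare_request(base_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/prod/api/major-incidents",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_declare_request(
    "https://your-instance.stackflow-tech.com",
    "example-token",  # placeholder; use a real bearer token
    {
        "incident_id": "INC0001234",
        "major_incident_manager": "usr_alice_johnson",
        "bridge_link": "https://meet.google.com/abc-defg-hij",
        "initial_impact_statement": "Production API gateway returning 503 for all users",
    },
)
# To send it: urllib.request.urlopen(request)
```

Separating construction from sending makes the payload easy to inspect or log before anything reaches the API.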

Communication Bridge

The Communication Bridge section of the War Room tracks all conference bridge details and attendees. StackFlow can automatically post War Room updates to a designated Slack or Microsoft Teams channel, so stakeholders receive real-time updates without monitoring the portal.
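
To illustrate what a posted update might look like, the sketch below formats a War Room update as a Slack incoming-webhook body, which accepts a JSON object with a `text` field. The message layout and the `format_war_room_update` helper are assumptions for illustration; StackFlow's built-in integration may format and deliver updates differently.

```python
import json


def format_war_room_update(incident_id: str, state: str, summary: str, bridge_link: str) -> dict:
    """Return a Slack incoming-webhook payload for one War Room update."""
    text = (
        f"*Major Incident {incident_id}* [{state}]\n"
        f"{summary}\n"
        f"Bridge: {bridge_link}"
    )
    return {"text": text}


payload = format_war_room_update(
    "INC0001234",
    "Investigating",
    "Production API gateway returning 503 for all users",
    "https://meet.google.com/abc-defg-hij",
)
# JSON body that would be POSTed to the channel's webhook URL.
body = json.dumps(payload).encode("utf-8")
```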

Stakeholder Updates

Pre-configured stakeholder communication templates ensure consistent messaging during a Major Incident. Templates are available for the initial notification, 30-minute update, resolution notification, and post-incident summary. The AI Copilot can draft these communications based on the current incident state and timeline.

| Update Type             | Trigger              | Audience                          |
|-------------------------|----------------------|-----------------------------------|
| Initial Notification    | MI declared          | IT leadership, affected dept heads |
| 30-Minute Update        | Every 30 min         | IT leadership                     |
| Status Page Update      | On state change      | All users (via status page)       |
| Resolution Notification | Incident resolved    | All stakeholders                  |
| PIR Summary             | PIR complete (T+48h) | IT leadership, ITSM manager       |
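
The table above is effectively a routing map from trigger to update type and audience. The sketch below encodes it as a lookup and adds a helper for the 30-minute cadence. The event keys (`mi_declared`, `interval_30_min`, and so on) are hypothetical labels, and counting intervals from the declaration time is an assumption rather than documented behavior.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical event keys mapped to (update type, audience) from the table above.
UPDATE_RULES = {
    "mi_declared": ("Initial Notification", ["IT leadership", "affected dept heads"]),
    "interval_30_min": ("30-Minute Update", ["IT leadership"]),
    "state_changed": ("Status Page Update", ["All users (via status page)"]),
    "incident_resolved": ("Resolution Notification", ["All stakeholders"]),
    "pir_complete": ("PIR Summary", ["IT leadership", "ITSM manager"]),
}

UPDATE_INTERVAL = timedelta(minutes=30)


def update_for(event: str) -> tuple[str, list[str]]:
    """Return (update type, audience) for a trigger event."""
    return UPDATE_RULES[event]


def next_periodic_update(declared_at: datetime, now: datetime) -> datetime:
    """Return the first 30-minute mark strictly after `now`.

    Assumes intervals are counted from the moment the MI was declared.
    """
    elapsed_intervals = (now - declared_at) // UPDATE_INTERVAL
    return declared_at + (elapsed_intervals + 1) * UPDATE_INTERVAL
```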

Post-Incident Review

A blameless Post-Incident Review (PIR) is automatically scheduled for 48 hours after a Major Incident is resolved. The PIR template in StackFlow follows the Google SRE postmortem format and includes a pre-populated timeline built from the War Room activity log. AI-assisted PIR generation can produce an initial draft within minutes of the incident being resolved.
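
The scheduling rule and the pre-populated timeline can be sketched as follows. The activity-log entry shape (a dict with `at` and `note` keys) and both function names are hypothetical; the actual War Room log format is not described on this page.

```python
from datetime import datetime, timedelta, timezone

PIR_DELAY = timedelta(hours=48)  # PIR is scheduled 48 hours after resolution


def pir_scheduled_for(resolved_at: datetime) -> datetime:
    """Return when the PIR should take place for a given resolution time."""
    return resolved_at + PIR_DELAY


def build_timeline(activity_log: list[dict]) -> list[str]:
    """Sort activity entries chronologically and render one line per entry.

    Each entry is assumed to look like {"at": datetime, "note": str}.
    """
    ordered = sorted(activity_log, key=lambda entry: entry["at"])
    return [f'{entry["at"]:%Y-%m-%d %H:%M} {entry["note"]}' for entry in ordered]
```

Sorting by timestamp before rendering means entries can be collected from several sources, such as chat and portal updates, without worrying about arrival order.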