CMDB Overview
CMDB Architecture
The StackFlow Configuration Management Database (CMDB) is a dual-layer system combining a relational store (Aurora PostgreSQL) for structured CI attributes with a graph database (Amazon Neptune) for CI relationships and dependency mapping. This architecture provides both fast attribute queries and efficient graph traversals for impact analysis.
- DynamoDB: StackFlow_CI table (Configuration Item store) with GSI on ciType and tenantId
- Neptune: stackflow-knowledge-graph cluster accessible on port 8182 from Lambda VPC
- IAM Role: StackFlowNeptuneCMDBSeeder Lambda role with Neptune IAM auth permissions
- S3: stackflow-cmdb-exports-373544523367 bucket for CMDB export/import artifacts
The Neptune cluster at stackflow-knowledge-graph.cluster-c6pq0smgmlri.us-east-1.neptune.amazonaws.com:8182 stores all CI relationships as a property graph. Each CI is a vertex with edges representing relationships (depends_on, hosted_on, communicates_with, owned_by, etc.). The graph is queried via Gremlin.
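To make the Gremlin access pattern concrete, here is a minimal sketch that builds a traversal string for finding every CI that transitively relies on a given CI. The ciId vertex property, the edge direction (an edge A -depends_on-> B meaning "A depends on B"), the depth limit, and the helper name are all assumptions for illustration; the actual StackFlow graph schema may differ.

```python
import re

# Assumption: each CI vertex has a unique "ciId" property, and an edge
# A -[depends_on]-> B means "A depends on B". Neither is confirmed by the source.
_SAFE_TOKEN = re.compile(r"^[A-Za-z0-9_\-:.]+$")


def build_dependents_query(ci_id, edge_labels=("depends_on", "hosted_on"), max_depth=5):
    """Return a Gremlin string listing the CIs that transitively rely on ci_id."""
    # Reject anything that could break out of the quoted literals.
    for token in (ci_id, *edge_labels):
        if not _SAFE_TOKEN.match(token):
            raise ValueError(f"unsafe token in query: {token!r}")
    labels = ", ".join(f"'{label}'" for label in edge_labels)
    return (
        f"g.V().has('ciId', '{ci_id}')"
        # Walk incoming edges: from a dependency back to its dependents.
        f".repeat(__.in({labels}).simplePath())"
        f".emit().times({max_depth})"
        ".dedup().values('ciId')"
    )


if __name__ == "__main__":
    print(build_dependents_query("ci-123"))
```

The string would then be submitted to the cluster endpoint on port 8182 by whatever Gremlin client the Lambda uses. The sketch validates identifiers before interpolating them because the query is assembled as text.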
The StackFlowNeptuneCMDBSeeder Lambda function performs both the initial seeding and the delta seeding of the Neptune graph from the Aurora CMDB records. Delta syncs run on a schedule, every 5 minutes, to keep the graph current.
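One common way to implement a delta sync is a timestamp watermark: each run picks up only the records modified since the previous run, then advances the watermark. The sketch below illustrates that idea. The CIRecord shape, the updated_at field, and the watermark mechanism are assumptions, since the source does not describe how the seeder detects changes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CIRecord:
    # Hypothetical record shape; field names are illustrative only.
    ci_id: str
    ci_type: str
    updated_at: datetime


def select_delta(records, last_sync):
    """Return (records changed after last_sync, new watermark)."""
    changed = [r for r in records if r.updated_at > last_sync]
    # If nothing changed, keep the old watermark so no window is skipped.
    new_watermark = max((r.updated_at for r in changed), default=last_sync)
    return changed, new_watermark


if __name__ == "__main__":
    def at(minute):
        return datetime(2024, 1, 1, 12, minute, tzinfo=timezone.utc)

    records = [
        CIRecord("ci-1", "EC2", at(0)),
        CIRecord("ci-2", "RDS", at(4)),
        CIRecord("ci-3", "Lambda", at(7)),
    ]
    changed, watermark = select_delta(records, last_sync=at(3))
    print([r.ci_id for r in changed], watermark.isoformat())
```

Deriving the new watermark from the data rather than from the wall clock avoids missing records that are written while a sync is in progress.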
Data Sources
| Source | Method | CI Types |
|---|---|---|
| AWS | Auto-discovery via Config/Systems Manager | EC2, RDS, Lambda, ELB, S3, VPC |
| Azure | Auto-discovery via Azure Resource Manager | VMs, App Services, SQL, Storage |
| GCP | Auto-discovery via GCP Asset Inventory | GCE, Cloud SQL, GCS, GKE |
| On-Premises | StackFlow agent (Linux/Windows) | Servers, services, ports, software |
| Manual | UI / API / CSV import | Any CI type |
| ITSM Integration | Extracted from incidents/changes | Referenced CIs |
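For the manual CSV import path, a loader might look roughly like the sketch below. The column names are assumptions (ciType and tenantId are borrowed from the index keys mentioned earlier, ciId is hypothetical), and the "source" tag is an illustrative way of recording where a CI came from, not a documented StackFlow field.

```python
import csv
import io

# Hypothetical required columns for an import file.
REQUIRED_COLUMNS = {"ciId", "ciType", "tenantId"}


def parse_ci_csv(text):
    """Parse CSV text into a list of CI dicts tagged with their source."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    records = []
    for row in reader:
        # Skip overflow cells (key None) and normalise missing cells to "".
        record = {
            key: (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        record["source"] = "manual"
        records.append(record)
    return records


if __name__ == "__main__":
    sample = "ciId,ciType,tenantId,name\nci-9, Server ,t-1,db-host\n"
    print(parse_ci_csv(sample))
```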
CI Lifecycle
CIs transition through a defined lifecycle: Planned → In Development → Installed → Active → Maintenance → Retired → Disposed. State transitions can trigger automated workflows (e.g., decommission checklist on Retired), notifications, and ITSM change records. Historical state changes are retained for audit and compliance purposes.
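The lifecycle can be modelled as a small state machine. The sketch below encodes the seven states in the order listed and permits only the next forward step; that restriction is an assumption, and a real deployment may allow additional moves (for example, Maintenance back to Active). The hook mechanism is likewise a hypothetical stand-in for the automated workflows.

```python
from enum import Enum


class CIState(Enum):
    PLANNED = "Planned"
    IN_DEVELOPMENT = "In Development"
    INSTALLED = "Installed"
    ACTIVE = "Active"
    MAINTENANCE = "Maintenance"
    RETIRED = "Retired"
    DISPOSED = "Disposed"


_ORDER = list(CIState)  # Enum iteration follows definition order.

# Assumption: each state may only advance to the one listed after it.
ALLOWED = {cur: {nxt} for cur, nxt in zip(_ORDER, _ORDER[1:])}
ALLOWED[CIState.DISPOSED] = set()


def transition(history, new_state, hooks=None):
    """Append new_state to history if allowed, then run its hooks."""
    current = history[-1]
    if new_state not in ALLOWED[current]:
        raise ValueError(f"{current.value} -> {new_state.value} is not allowed")
    history.append(new_state)  # The full history is kept for audit.
    for hook in (hooks or {}).get(new_state, []):
        hook(new_state)
    return history


if __name__ == "__main__":
    fired = []
    hooks = {CIState.RETIRED: [lambda s: fired.append("decommission checklist")]}
    history = [CIState.MAINTENANCE]
    transition(history, CIState.RETIRED, hooks)
    print([s.value for s in history], fired)
```

Keeping the allowed transitions in a table rather than in branching logic makes it straightforward to extend the model without touching the transition function.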
Health Scoring
Each CI receives an automated health score (0-100) calculated from incident history, SLA breaches, maintenance records, and age. CIs with scores below 40 are flagged for proactive maintenance review in the Cloud Fleet Health dashboard. Health scores are recalculated every 15 minutes.
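The scoring formula itself is not given here, so the following is only an illustrative sketch of how the four inputs could be combined into a 0-100 score. Every weight, cap, time window, and parameter name is hypothetical; only the 0-100 range and the review threshold of 40 come from the text above.

```python
REVIEW_THRESHOLD = 40  # From the text: scores below 40 are flagged.


def health_score(incidents_90d, sla_breaches_90d, overdue_maintenance, age_years):
    """Return a 0-100 score; higher is healthier. Weights are hypothetical."""
    # Each factor contributes a capped penalty; the caps sum to 100,
    # so the result can never fall below 0.
    penalty = (
        min(incidents_90d * 5, 40)
        + min(sla_breaches_90d * 10, 30)
        + min(overdue_maintenance * 5, 15)
        + min(age_years * 3, 15)
    )
    return max(0, round(100 - penalty))


def needs_review(score):
    """True when the CI should be flagged for proactive maintenance review."""
    return score < REVIEW_THRESHOLD


if __name__ == "__main__":
    score = health_score(incidents_90d=4, sla_breaches_90d=2,
                         overdue_maintenance=1, age_years=2.0)
    print(score, needs_review(score))  # prints: 49 False
```

Capping each factor keeps a single noisy input, such as a burst of incidents, from dominating the score.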
Integration with ITSM
CMDB integration enriches ITSM workflows at every stage. When an incident is created with an affected CI, the Neptune graph is queried to identify all dependent services and CIs, which are displayed in the incident's "Impact Analysis" tab. Changes reference CIs for risk calculation and trigger CMDB updates upon successful implementation.
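The impact analysis amounts to a reverse reachability search over the dependency graph. The sketch below shows the idea on an in-memory edge list using breadth-first search; in production the equivalent traversal would run in Neptune. The edge tuple format, the choice of which edge labels propagate impact, and the edge direction (source relies on target) are assumptions.

```python
from collections import deque

# Assumption: only these relationship types propagate impact.
IMPACT_LABELS = {"depends_on", "hosted_on"}


def impacted_cis(edges, affected_ci):
    """Return the set of CIs that directly or transitively rely on affected_ci.

    edges is an iterable of (source, label, target) tuples, where
    source -[label]-> target means "source relies on target".
    """
    # Build a reverse index: target -> CIs that rely on it.
    dependents = {}
    for source, label, target in edges:
        if label in IMPACT_LABELS:
            dependents.setdefault(target, set()).add(source)

    seen = set()
    queue = deque([affected_ci])
    while queue:
        current = queue.popleft()
        for dep in dependents.get(current, ()):
            if dep not in seen and dep != affected_ci:
                seen.add(dep)
                queue.append(dep)
    return seen


if __name__ == "__main__":
    edges = [
        ("web", "depends_on", "api"),
        ("api", "depends_on", "db"),
        ("db", "hosted_on", "host-1"),
        ("api", "owned_by", "team-a"),
    ]
    print(sorted(impacted_cis(edges, "host-1")))  # prints: ['api', 'db', 'web']
```

The seen set guards against cycles, which can occur when services depend on each other.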