Neptune Knowledge Graph
Graph Architecture
The Amazon Neptune cluster (stackflow-knowledge-graph.cluster-c6pq0smgmlri.us-east-1.neptune.amazonaws.com:8182) stores the StackFlow CMDB relationship graph. Neptune uses a property graph model where vertices represent CIs and edges represent relationships between them. The graph enables millisecond-latency graph traversals for impact analysis, dependency mapping, and blast radius calculations.
- Neptune Cluster: stackflow-knowledge-graph on engine 1.4.7.0, port 8182, IAM auth enabled
- KMS: Neptune encrypted with key 98d8c3a0-280f-4c1e-b1ff-0d4029120bdb
- IAM: StackFlowNeptuneRole with neptune-db:* actions on the cluster ARN
- VPC: Neptune cluster in same VPC (vpc-0c4e3c18734dee8f7) and subnets as Lambda functions
- Seeder Lambda: StackFlowNeptuneCMDBSeeder with correct Neptune endpoint env var
Neptune runs in the same VPC (vpc-0c4e3c18734dee8f7) as the StackFlowAPI Lambda, enabling direct Gremlin WebSocket connections without network egress. The cluster uses Multi-AZ replication across us-east-1a and us-east-1b with automatic failover.
Vertex and Edge Schema
| Element | Type | Key Properties |
|---|---|---|
| CI Vertex | Vertex | ci_id, ci_class, name, state, tenant_id |
| depends_on | Edge | dependency_type, criticality, since |
| hosted_on | Edge | hosting_type (physical, virtual, cloud) |
| communicates_with | Edge | protocol, port, direction |
| owned_by | Edge | ownership_type |
| member_of | Edge | cluster, group type |
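The schema above can be mirrored in client code so that malformed relationships are rejected before they reach the graph. The following is a minimal Python sketch, not the StackFlow implementation: the class names `CIVertex` and `Edge` and the validation rules are assumptions made for illustration, while the labels and property names come from the table.

```python
from dataclasses import dataclass

# Edge labels taken from the schema table.
EDGE_LABELS = {"depends_on", "hosted_on", "communicates_with", "owned_by", "member_of"}

# Allowed hosting_type values, as listed in the table.
HOSTING_TYPES = {"physical", "virtual", "cloud"}


@dataclass(frozen=True)
class CIVertex:
    """One CI vertex carrying the key properties from the table."""
    ci_id: str
    ci_class: str
    name: str
    state: str
    tenant_id: str


@dataclass(frozen=True)
class Edge:
    """A directed relationship between two CIs (hypothetical helper type)."""
    label: str
    from_ci: str
    to_ci: str
    properties: dict

    def __post_init__(self):
        # Reject labels that are not part of the documented schema.
        if self.label not in EDGE_LABELS:
            raise ValueError(f"unknown edge label: {self.label}")
        # hosted_on edges must carry one of the documented hosting types.
        if self.label == "hosted_on":
            hosting = self.properties.get("hosting_type")
            if hosting not in HOSTING_TYPES:
                raise ValueError(f"invalid hosting_type: {hosting}")
```

Validating at construction time keeps schema drift out of the graph, at the cost of having to update `EDGE_LABELS` whenever a new relationship type is introduced.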
Gremlin Query Examples
// Find all CIs that depend on the production Aurora cluster
// valueMap() keeps each CI's properties together; values() would emit one flat stream
g.V().has('ci_id', 'ci_aurora_main_prod')
.in('depends_on')
.valueMap('name', 'ci_class', 'state')
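// Blast radius: all CIs within 3 hops of a given CI
// emit() returns the CIs reached at hops 1, 2, and 3; without it only CIs exactly 3 hops away are returned
g.V().has('ci_id', 'ci_aurora_main_prod')
.repeat(__.in('depends_on').simplePath())
.times(3)
.emit()
.dedup()
.values('name')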
// Find orphaned CIs (no relationships)
g.V().hasLabel('ci')
.where(__.not(__.bothE()))
.values('ci_id', 'name')
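To make the intent of the blast radius and orphan queries concrete, the same logic can be sketched over a plain in-memory graph. This is an illustration only and does not talk to Neptune: the function names and every CI name other than `ci_aurora_main_prod` are hypothetical, and a breadth-first search stands in for the Gremlin traversal.

```python
from collections import deque


def blast_radius(depends_on, start, max_hops):
    """Return CIs that transitively depend on `start` within max_hops hops.

    depends_on maps each CI to the set of CIs it depends on, so the search
    follows edges in reverse, like in('depends_on') in Gremlin.
    """
    # Invert the adjacency map: target -> CIs that depend on it.
    dependents = {}
    for src, targets in depends_on.items():
        for dst in targets:
            dependents.setdefault(dst, set()).add(src)

    seen = {start}          # never revisit a CI, so cycles terminate
    affected = set()
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue        # depth limit reached on this branch
        for dep in dependents.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                affected.add(dep)
                queue.append((dep, hops + 1))
    return affected


def orphans(all_cis, depends_on):
    """Return CIs that have neither incoming nor outgoing edges."""
    connected = set()
    for src, targets in depends_on.items():
        if targets:
            connected.add(src)
            connected.update(targets)
    return set(all_cis) - connected


# Hypothetical sample graph: a chain of dependents plus one unconnected CI.
graph = {
    "ci_api": {"ci_aurora_main_prod"},
    "ci_web": {"ci_api"},
    "ci_cdn": {"ci_web"},
    "ci_dns": {"ci_cdn"},
}
all_cis = ["ci_aurora_main_prod", "ci_api", "ci_web", "ci_cdn", "ci_dns", "ci_spare"]

print(sorted(blast_radius(graph, "ci_aurora_main_prod", 3)))
print(sorted(orphans(all_cis, graph)))
```

In this sample, `ci_dns` sits four hops from the Aurora cluster and therefore falls outside a 3-hop blast radius, while `ci_spare` is the only orphan.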
Neptune Maintenance
Neptune maintenance windows are configured in the AWS RDS console. Schedule maintenance during low-activity periods (typically Sunday 02:00-04:00 UTC). Before a major Neptune engine upgrade, always test the upgrade on a cluster restored from the latest snapshot. The StackFlowNeptuneCMDBSeeder Lambda may need to be restarted after a Neptune engine upgrade to re-establish its Gremlin connection pool.
All Gremlin queries must include a tenant_id vertex property filter to maintain tenant isolation. Never run Gremlin queries without a tenant scope in production.
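One way to make the tenant scope hard to forget is to build queries through a helper that refuses to run without a tenant. The sketch below is an assumption about how this could look, not existing StackFlow code: `tenant_scoped_lookup` and the binding names `tenantId` and `ciId` are hypothetical, and it assumes the Gremlin client in use accepts a query string together with a bindings dictionary.

```python
def tenant_scoped_lookup(ci_id, tenant_id):
    """Return (query, bindings) for a CI lookup restricted to one tenant."""
    if not tenant_id:
        # Fail fast instead of issuing an unscoped query.
        raise ValueError("tenant_id is required for every production query")
    query = (
        "g.V().hasLabel('ci')"
        ".has('tenant_id', tenantId)"   # tenant scope applied before the CI lookup
        ".has('ci_id', ciId)"
        ".valueMap('name', 'ci_class', 'state')"
    )
    # Values travel as bindings rather than being concatenated into the string.
    return query, {"tenantId": tenant_id, "ciId": ci_id}


query, bindings = tenant_scoped_lookup("ci_aurora_main_prod", "tenant_a")
print(query)
print(bindings)
```

Passing values as bindings also avoids building queries by string concatenation, which would otherwise let a crafted `ci_id` alter the traversal.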
Performance Tuning
For optimal Neptune performance with large graphs (millions of vertices), use the following guidelines: limit traversal depth to a maximum of 5 hops, use simplePath() to avoid cycles, always filter by vertex label before property filters, and use Neptune's bulk loader for initial CMDB seeding rather than individual upserts.
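For the bulk loader guideline, the seed data has to be written in Neptune's Gremlin CSV load format, where vertex files start with `~id` and `~label` columns and edge files with `~id`, `~from`, `~to`, and `~label`. The following sketch shows one way to render the CMDB schema into that format. Several choices here are assumptions rather than documented StackFlow behavior: reusing `ci_id` as the vertex `~id`, typing every property as `String`, the edge ID scheme, and the sample values.

```python
import csv
import io

# Column headers in Neptune's Gremlin CSV load format (property types assumed).
VERTEX_HEADER = ["~id", "~label", "ci_id:String", "ci_class:String",
                 "name:String", "state:String", "tenant_id:String"]
EDGE_HEADER = ["~id", "~from", "~to", "~label", "criticality:String"]


def vertices_csv(cis):
    """Render CI dicts as a vertex CSV string, using ci_id as the vertex ID."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(VERTEX_HEADER)
    for ci in cis:
        writer.writerow([ci["ci_id"], "ci", ci["ci_id"], ci["ci_class"],
                         ci["name"], ci["state"], ci["tenant_id"]])
    return buf.getvalue()


def depends_on_csv(edges):
    """Render (from_ci, to_ci, criticality) tuples as a depends_on edge CSV string."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(EDGE_HEADER)
    for src, dst, criticality in edges:
        # Hypothetical deterministic edge ID so that reloads do not duplicate edges.
        writer.writerow([f"{src}-depends_on-{dst}", src, dst, "depends_on", criticality])
    return buf.getvalue()


print(vertices_csv([{
    "ci_id": "ci_aurora_main_prod", "ci_class": "database",
    "name": "Aurora Main", "state": "active", "tenant_id": "tenant_a",
}]))
print(depends_on_csv([("ci_api", "ci_aurora_main_prod", "high")]))
```

The resulting files would be uploaded to S3 and passed to the loader in a single request, which avoids one round trip per CI during initial seeding.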
# Check Neptune cluster status
aws neptune describe-db-clusters --db-cluster-identifier stackflow-knowledge-graph --region us-east-1
# View slow query logs
aws logs filter-log-events --log-group-name /aws/neptune/stackflow-knowledge-graph/audit --filter-pattern "duration > 1000" --region us-east-1