Discovery Failures

CMDB discovery failures leave CI data inaccurate and can degrade impact analysis, change risk scoring, and AI triage accuracy. Most discovery failures are caused by network connectivity issues, credential problems, or cloud API rate limiting. For error details, check the logs under CMDB → Discovery → Discovery Logs.

⚙️ Minimum Requirements
  • CloudWatch Logs: /aws/lambda/StackFlowCloudDiscovery and /aws/lambda/StackFlowNeptuneCMDBSeeder log groups
  • DynamoDB: StackFlow_DiscoveryJob table accessible to check last discovery run status and errors
  • Neptune: Graph query endpoint accessible from diagnostic tooling within VPC
  • IAM: StackFlowDiscoveryRole cross-account trust active in all connected accounts
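
One way to confirm that the cross-account trust is in place is to inspect the role's trust policy for the expected external ID condition. The sketch below is illustrative only: the function name, the sample account ID, and the external ID value are hypothetical, and it matches the condition key exactly as written (sts:ExternalId under StringEquals).

```python
from typing import Any, Dict


def has_external_id_condition(policy: Dict[str, Any], external_id: str) -> bool:
    """Return True if an Allow statement for sts:AssumeRole requires this external ID."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM accepts a single statement object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        expected = (
            stmt.get("Condition", {}).get("StringEquals", {}).get("sts:ExternalId")
        )
        if expected is None:
            continue  # this statement carries no external ID condition
        allowed = [expected] if isinstance(expected, str) else list(expected)
        if external_id in allowed:
            return True
    return False


# Hypothetical trust policy; the account ID and external ID are placeholders.
SAMPLE_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": "example-external-id"}},
        }
    ],
}
```

The policy document can be obtained with aws iam get-role --role-name StackFlowDiscoveryRole --query Role.AssumeRolePolicyDocument in each connected account and passed to the function after json.loads.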

Agent Issues

| Symptom | Likely Cause | Diagnostic Step | Resolution |
|---|---|---|---|
| Agent last heartbeat > 30 min | Agent service stopped or network issue | systemctl status stackflow-agent on target host | Restart agent service; check firewall rules for HTTPS outbound |
| Agent showing wrong hostname | Hostname changed after agent install | Check /etc/stackflow-agent/config.yaml for override | Update hostname_override in config; restart agent |
| Agent token expired | Token revoked or rotated | Check token status in Admin → Discovery → Agent Tokens | Generate new token, update config.yaml on target host |
| Hardware inventory incomplete | Agent lacks root/admin access | Check agent logs for "Permission denied" errors | Run agent as root (Linux) or SYSTEM (Windows) |
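
The 30-minute heartbeat threshold from the first row can be checked in bulk once heartbeat timestamps have been exported. This is a minimal sketch under stated assumptions: the mapping of hostname to last heartbeat is a hypothetical input shape, not a documented StackFlow export, and all timestamps are expected to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

HEARTBEAT_LIMIT = timedelta(minutes=30)  # threshold taken from the symptom table


def stale_agents(
    heartbeats: Dict[str, datetime], now: Optional[datetime] = None
) -> List[str]:
    """Return hostnames whose last heartbeat is more than 30 minutes old."""
    now = now or datetime.now(timezone.utc)
    return sorted(
        host for host, seen in heartbeats.items() if now - seen > HEARTBEAT_LIMIT
    )
```

Hosts returned by stale_agents are the ones to check with systemctl status stackflow-agent first.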

Cloud Discovery Issues

| Symptom | Likely Cause | Diagnostic Step | Resolution |
|---|---|---|---|
| AWS account shows 0 CIs | Cross-account role trust policy wrong | Test role assumption: aws sts assume-role --role-arn arn:aws:iam::{account}:role/StackFlowDiscoveryRole --role-session-name test | Fix trust policy external ID condition |
| Azure discovery fails with 401 | Azure service principal secret expired | Check expiry in Azure Portal → App Registrations → Certificates & Secrets | Rotate client secret, update Secrets Manager entry |
| Partial AWS discovery (some services missing) | Missing IAM permissions for specific services | Check Lambda CloudWatch logs for AccessDenied during discovery | Add missing read permissions to StackFlowDiscoveryPolicy |

Discovery Logs: Detailed discovery logs are in CloudWatch under /aws/lambda/StackFlowAPI filtered by discovery. Each discovery run logs start, per-service counts, any errors, and total CI count at completion.
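
When a partial AWS discovery is suspected, exported log lines can be summarized by the API operation that was denied. The sketch below assumes the discovery Lambda logs errors in the standard botocore ClientError message format ("An error occurred (Code) when calling the Operation operation: ..."); whether StackFlow logs errors in that form is an assumption, and the function name is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Message shape produced by botocore's ClientError.
CLIENT_ERROR = re.compile(
    r"An error occurred \((?P<code>[\w.]+)\) when calling the (?P<op>\w+) operation"
)


def denied_operations(log_lines: Iterable[str]) -> Counter:
    """Count AccessDenied-style errors per API operation name."""
    denied: Counter = Counter()
    for line in log_lines:
        match = CLIENT_ERROR.search(line)
        # Covers both AccessDenied and AccessDeniedException error codes.
        if match and match.group("code").startswith("AccessDenied"):
            denied[match.group("op")] += 1
    return denied
```

The operation name is not always identical to the IAM action to add to StackFlowDiscoveryPolicy (for example, the S3 ListBuckets operation is authorized by s3:ListAllMyBuckets), so treat the result as a list of leads rather than a ready-made policy.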

Neptune Sync Issues

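# Invoke the seeder in delta mode with dry_run enabled, then inspect the response.
# AWS CLI v2 only: add --cli-binary-format raw-in-base64-out so the raw JSON payload is accepted.
aws lambda invoke \
  --function-name StackFlowNeptuneCMDBSeeder \
  --payload '{"mode": "delta", "dry_run": true}' \
  --region us-east-1 \
  output.json
cat output.json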

If the seeder Lambda is failing, check its CloudWatch logs. Common issues include Neptune connection timeouts (check Neptune cluster status), Gremlin serialization errors on malformed CI data, and Lambda timeout when syncing very large CI sets (reduce batch size in seeder configuration).
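
To illustrate why a smaller batch size helps with Lambda timeouts, the sketch below splits a CI set into fixed-size batches so that each unit of work stays small. It is a generic chunking helper, not the seeder's actual code; the function name is hypothetical and the real batch size setting lives in the seeder configuration.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunk_cis(cis: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(cis), batch_size):
        yield list(cis[start:start + batch_size])
```

Halving the batch size doubles the number of batches while roughly halving the work done per batch, which is the trade-off to weigh against the Lambda timeout.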

CMDB Data Quality

Data quality issues manifest as incorrect impact analysis results, wrong CI counts in dashboards, or duplicate CI records. The CMDB health dashboard at CMDB → Health shows: duplicate detection results, orphaned CIs (no relationships), stale CIs (no update in 30+ days), and missing mandatory attributes. Address these proactively to maintain AI analysis quality.
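
The four health checks listed above can be approximated offline against an exported CI list. This is a rough sketch under stated assumptions: the field names (id, name, ci_class, owner, relationships, updated_at), the mandatory attribute list, and the duplicate rule (same class and case-insensitive name) are all hypothetical and may differ from what the CMDB health dashboard actually uses.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Any, Dict, List

STALE_AFTER = timedelta(days=30)  # "no update in 30+ days"
MANDATORY = ("name", "ci_class", "owner")  # hypothetical mandatory attributes


def quality_report(cis: List[Dict[str, Any]], now: datetime) -> Dict[str, list]:
    """Group CI ids by the data quality problem they exhibit."""
    report: Dict[str, list] = {
        "duplicates": [],
        "orphaned": [],
        "stale": [],
        "missing_attributes": [],
    }
    seen = defaultdict(list)
    for ci in cis:
        ci_id = ci["id"]
        if any(not ci.get(attr) for attr in MANDATORY):
            report["missing_attributes"].append(ci_id)
        # Assumed duplicate rule: same class and same name, ignoring case.
        seen[(ci.get("ci_class"), str(ci.get("name", "")).lower())].append(ci_id)
        if not ci.get("relationships"):
            report["orphaned"].append(ci_id)
        if now - ci["updated_at"] >= STALE_AFTER:
            report["stale"].append(ci_id)
    report["duplicates"] = [ids for ids in seen.values() if len(ids) > 1]
    return report
```

Each key in the returned dictionary corresponds to one dashboard category, so the output can be compared against the counts shown in CMDB → Health.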