API & Lambda Errors

Error Classification

StackFlow API errors fall into client errors (4xx) and server errors (5xx). Client errors indicate issues with the request itself (bad parameters, missing auth, not found). Server errors indicate internal platform issues and require investigation. All 5xx errors generate a structured error response with a request_id that can be used to find the corresponding CloudWatch log entry.

⚙️ Minimum Requirements
  • CloudWatch Logs: /aws/lambda/StackFlowAPI log group; set retention to 30 days minimum
  • X-Ray: AWS X-Ray tracing enabled on StackFlowAPI Lambda and API Gateway uazcuhdus2
  • CloudWatch Alarms: StackFlowAPI-Errors and StackFlowAPI-Throttles alarms active with SNS notification
  • Lambda Concurrency: Reserved concurrency of at least 50 on StackFlowAPI to prevent throttling spikes
Example 5xx error response:
{
  "error": {
    "code": "INTERNAL_SERVER_ERROR",
    "message": "An unexpected error occurred",
    "request_id": "7f3b2c1d-8e4a-4f2b-9c1d-2e3f4a5b6c7d",
    "timestamp": "2026-05-18T14:23:11Z"
  }
}
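
The classification and the request_id lookup described above can be sketched in client code as follows. This is a minimal illustration that assumes the error body has the shape of the sample response; the function names classify_status and extract_request_id are hypothetical and not part of any StackFlow SDK.

```python
import json
from typing import Optional


def classify_status(status_code: int) -> str:
    """Map an HTTP status code to the error class used in this guide."""
    if 400 <= status_code <= 499:
        return "client_error"
    if 500 <= status_code <= 599:
        return "server_error"
    return "not_an_error"


def extract_request_id(body: str) -> Optional[str]:
    """Return error.request_id from a response body, or None if absent."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    error = payload.get("error")
    if not isinstance(error, dict):
        return None
    return error.get("request_id")
```

Logging the extracted request_id alongside the status code on the client side makes it easier to match a failed call to its CloudWatch log entry later.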

500 Internal Server Errors

| Symptom | Likely Cause | Diagnostic Step | Resolution |
| --- | --- | --- | --- |
| Consistent 500 on specific endpoint | Code bug or unhandled exception | Search CloudWatch logs by request_id | Check Lambda logs for stack trace, deploy fix |
| 500 with "Connection timeout" in logs | Aurora max_connections reached | Run SELECT count(*), state FROM pg_stat_activity GROUP BY state on Aurora | Reduce Lambda concurrency or increase max_connections |
| 500 with "Secret not found" in logs | Secrets Manager secret deleted or rotated incorrectly | Check secret ARN exists in Secrets Manager | Restore secret from backup or re-create with correct ARN |
| 500 with KMS AccessDeniedException | Lambda execution role missing KMS permission | Check IAM role policy for kms:Decrypt | Add kms:Decrypt permission for CMK to Lambda role |
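
The "search CloudWatch logs by request_id" step can be scripted. The sketch below assumes a client object that behaves like boto3.client("logs") and uses its filter_log_events call with pagination; the function name find_events_by_request_id and the one-hour default lookback are illustrative choices, not StackFlow conventions.

```python
import time
from typing import Any, Dict, Iterator, Optional

LOG_GROUP = "/aws/lambda/StackFlowAPI"  # log group named in the requirements


def find_events_by_request_id(
    logs_client: Any,
    request_id: str,
    lookback_seconds: int = 3600,  # assumption: search the last hour
    now: Optional[float] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield log events whose message contains request_id.

    logs_client is expected to behave like boto3.client("logs").
    """
    end_ms = int((now if now is not None else time.time()) * 1000)
    params: Dict[str, Any] = {
        "logGroupName": LOG_GROUP,
        # Quotes are required because the ID contains hyphens.
        "filterPattern": f'"{request_id}"',
        "startTime": end_ms - lookback_seconds * 1000,
        "endTime": end_ms,
    }
    while True:
        page = logs_client.filter_log_events(**params)
        yield from page.get("events", [])
        token = page.get("nextToken")
        if not token:
            break
        # Follow pagination until CloudWatch stops returning a token.
        params["nextToken"] = token
```

Passing the client in as an argument keeps the function easy to exercise with a stub, and lets the caller decide which region and credentials to use.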

503 Service Unavailable

| Symptom | Likely Cause | Diagnostic Step | Resolution |
| --- | --- | --- | --- |
| 503 from API Gateway | Lambda throttled (concurrent execution limit hit) | Check Lambda Throttles metric in CloudWatch | Request concurrency limit increase from AWS Support |
| 503 from CloudFront | API Gateway 5XX error rate high | Check CloudFront distribution error rate | Investigate underlying API Gateway/Lambda errors |
| 503 with "Circuit breaker open" | Too many consecutive errors tripped the breaker | Check StackFlow circuit breaker state in Redis | Wait for reset interval or manually reset via admin API |
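
To make the "Circuit breaker open" row easier to reason about, here is a generic sketch of a consecutive-failure breaker with a reset interval. It is not StackFlow's implementation: the state is held in memory rather than Redis, and the class name, threshold, and interval values are all hypothetical.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Opens after N consecutive failures; permits calls again after a reset interval."""

    def __init__(
        self,
        failure_threshold: int = 5,    # hypothetical value
        reset_interval: float = 30.0,  # seconds; hypothetical value
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_interval = reset_interval
        self._clock = clock
        self._consecutive_failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        """Return False while the breaker is open and the interval has not elapsed."""
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self.reset_interval

    def record_success(self) -> None:
        """A success closes the breaker and clears the failure streak."""
        self._consecutive_failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure; open (or re-open) the breaker at the threshold."""
        self._consecutive_failures += 1
        if self._consecutive_failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

In this model, a 503 with "Circuit breaker open" corresponds to allow_request() returning False, and a manual reset corresponds to calling record_success() out of band.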

Cold Start Issues

Lambda cold starts add 1-4 seconds of latency to the first request served by a newly created Lambda instance. StackFlow mitigates cold starts with the StackFlowCacheWarmer Lambda, which pings the API every 4 minutes to keep instances warm. Traffic spikes can still force new instances to be created, and each new instance incurs a cold start on its first request.

aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name InitDuration \
  --dimensions Name=FunctionName,Value=StackFlowAPI \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --period 300 \
  --extended-statistics p99 \
  --region us-east-1
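Provisioned Concurrency: For production instances with strict latency SLAs, consider enabling Lambda Provisioned Concurrency for the StackFlowAPI function. It pre-initializes a configured number of Lambda instances so that requests served by them avoid cold starts, at the cost of additional hourly charges. Traffic above the provisioned level still runs on on-demand instances and can incur cold starts.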
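
Cold starts can also be measured from the function's own logs, because Lambda adds an "Init Duration" field to the REPORT line of an invocation that initialized a new instance. The sketch below parses REPORT lines that have already been fetched from the log group; the function names are hypothetical.

```python
import re
from typing import Iterable, List

# "Init Duration: <n> ms" appears on a REPORT line only when the invocation
# initialized a new instance. "Duration" and "Billed Duration" do not match.
_INIT_RE = re.compile(r"Init Duration: ([\d.]+) ms")


def init_durations_ms(report_lines: Iterable[str]) -> List[float]:
    """Return the Init Duration of every cold-start REPORT line, in ms."""
    durations = []
    for line in report_lines:
        if not line.startswith("REPORT"):
            continue
        match = _INIT_RE.search(line)
        if match:
            durations.append(float(match.group(1)))
    return durations


def cold_start_ratio(report_lines: Iterable[str]) -> float:
    """Fraction of invocations (REPORT lines) that were cold starts."""
    lines = [line for line in report_lines if line.startswith("REPORT")]
    if not lines:
        return 0.0
    return len(init_durations_ms(lines)) / len(lines)
```

A rising cold-start ratio during a traffic spike suggests that new instances are being created faster than the warmer keeps them alive, which is the situation Provisioned Concurrency is meant to address.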

Timeout Errors

The StackFlowAPI Lambda has a 300-second timeout. Requests exceeding this timeout return a 504 Gateway Timeout from API Gateway. API Gateway also enforces its own integration timeout (29 seconds by default for REST APIs), so a 504 can be returned before the Lambda timeout is reached. Common causes of timeouts include slow Bedrock API responses, Neptune graph traversals on very large graphs (10k+ nodes without proper indexing), and Aurora queries missing indexes on commonly filtered columns.

aws logs filter-log-events \
  --log-group-name /aws/lambda/StackFlowAPI \
  --filter-pattern "Task timed out" \
  --start-time $(date -d '1 hour ago' +%s000) \
  --region us-east-1
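
The events returned by the command above can be reduced to a list of affected request IDs. This sketch assumes the usual Lambda timeout message shape, "<timestamp> <request_id> Task timed out after <n> seconds", and takes the "events" array from the command's JSON output as input; the function name is hypothetical.

```python
import re
from typing import Any, Dict, Iterable, List, Tuple

# Assumed message shape:
#   "<timestamp> <request_id> Task timed out after <n> seconds"
_TIMEOUT_RE = re.compile(r"(\S+) Task timed out after ([\d.]+) seconds")


def timed_out_requests(events: Iterable[Dict[str, Any]]) -> List[Tuple[str, float]]:
    """Return (request_id, seconds) for each timeout event."""
    results = []
    for event in events:
        # Events without a "message" key are skipped rather than raising.
        match = _TIMEOUT_RE.search(event.get("message", ""))
        if match:
            results.append((match.group(1), float(match.group(2))))
    return results
```

Each returned request ID can then be searched in the same log group to see which downstream call (Bedrock, Neptune, or Aurora) was in flight when the timeout occurred.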