Bedrock Vector Search
Knowledge Base Configuration
The StackFlow Knowledge Base is powered by Amazon Bedrock Knowledge Bases with OpenSearch Serverless as the vector store. It stores vector embeddings of the runbooks, policies, SLA definitions, and CMDB documentation, enabling semantic search across the platform knowledge corpus.
⚙️ Minimum Requirements
- Knowledge Base: BXJGG7PIPS with status ACTIVE in us-east-1
- S3 Bucket: stackflow-kb-documents-373544523367 with documents in the runbooks/, policies/, sla/, and cmdb/ prefixes
- Embedding Model: amazon.titan-embed-text-v2:0 enabled for account 373544523367
- OpenSearch: Serverless collection q3oso7unldm9p4xsqez4 active with sufficient OCU capacity (min 2 OCU indexing + 2 OCU search)
- IAM: StackFlowBedrockKBRole with aoss:APIAccessAll on the collection ARN and s3:GetObject on the KB bucket
| Property | Value |
|---|---|
| Knowledge Base ID | BXJGG7PIPS |
| Name | StackFlow-KnowledgeBase |
| Embedding Model | amazon.titan-embed-text-v2:0 |
| Embedding Dimensions | 1024 |
| Vector Store | OpenSearch Serverless |
| Collection ARN | arn:aws:aoss:us-east-1:373544523367:collection/q3oso7unldm9p4xsqez4 |
| Index Name | stackflow-kb-index |
| Vector Field | embedding |
| Text Field | content |
| Metadata Field | metadata |
| IAM Role | StackFlowBedrockKBRole |
| Status | ACTIVE |
Document Structure in S3
stackflow-kb-documents-373544523367/
├── runbooks/
│   ├── aurora-high-connections.md
│   ├── aurora-high-connections.md.metadata.json
│   ├── lambda-cold-start-remediation.md
│   ├── redis-auth-failure.md
│   ├── neptune-query-timeout.md
│   └── cloudfront-503-investigation.md
├── policies/
│   ├── change-management-policy.md
│   ├── incident-severity-matrix.md
│   └── sla-definitions.md
├── sla/
│   ├── p1-response-targets.md
│   ├── business-hours-schedule.md
│   └── escalation-matrix.md
└── cmdb/
    ├── ci-classification-guide.md
    ├── discovery-scope.md
    └── neptune-schema-reference.md
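Each document should have a companion metadata file, named after the document with a .metadata.json suffix (for example, aurora-high-connections.md.metadata.json). Its attributes are indexed with the document's chunks and can be used to filter results at query time, which makes retrieval more precise: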
{
  "metadataAttributes": {
    "category": "runbook",
    "service": "aurora",
    "severity": "high",
    "title": "Aurora High Connection Count Remediation",
    "tags": ["aurora", "postgresql", "connection-pool", "rds-proxy"],
    "lastUpdated": "2026-03-15"
  }
}
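As a minimal sketch of the naming convention above, the following helper writes a sidecar next to a local document before upload. The function name `write_metadata_sidecar` and the temporary-directory demo are illustrative assumptions, not part of the StackFlow tooling.

```python
import json
import tempfile
from pathlib import Path


def write_metadata_sidecar(document: Path, attributes: dict) -> Path:
    """Write '<document name>.metadata.json' next to the document and return its path."""
    sidecar = document.with_name(document.name + ".metadata.json")
    payload = {"metadataAttributes": attributes}
    sidecar.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return sidecar


# Demo in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    doc = Path(tmp) / "aurora-high-connections.md"
    doc.write_text("# Aurora High Connection Count Remediation\n", encoding="utf-8")
    sidecar_path = write_metadata_sidecar(doc, {
        "category": "runbook",
        "service": "aurora",
        "severity": "high",
    })
    sidecar_name = sidecar_path.name
    sidecar_body = json.loads(sidecar_path.read_text(encoding="utf-8"))

print(sidecar_name)
```

Keeping the sidecar name derived from the document name avoids the two drifting apart when a runbook is renamed.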
Adding Documents to the KB
# 1. Upload the document with server-side encryption
aws s3 cp runbook.md s3://stackflow-kb-documents-373544523367/runbooks/runbook.md --sse AES256
# 2. Upload the metadata sidecar (same key as the document plus .metadata.json)
aws s3 cp runbook.metadata.json s3://stackflow-kb-documents-373544523367/runbooks/runbook.md.metadata.json
# 3. Get the data source ID (first data source of the KB) and keep it in a shell variable
DATA_SOURCE_ID=$(aws bedrock-agent list-data-sources --knowledge-base-id BXJGG7PIPS --region us-east-1 --query 'dataSourceSummaries[0].dataSourceId' --output text)
# 4. Trigger KB sync ingestion job and capture the job ID
INGESTION_JOB_ID=$(aws bedrock-agent start-ingestion-job --knowledge-base-id BXJGG7PIPS --data-source-id "$DATA_SOURCE_ID" --region us-east-1 --query 'ingestionJob.ingestionJobId' --output text)
# 5. Monitor job status
aws bedrock-agent get-ingestion-job --knowledge-base-id BXJGG7PIPS --data-source-id "$DATA_SOURCE_ID" --ingestion-job-id "$INGESTION_JOB_ID" --region us-east-1 --query 'ingestionJob.{Status:status,Stats:statistics}'
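The monitoring step can also be scripted. The sketch below polls `get_ingestion_job` until the job reaches a terminal state; the helper name `wait_for_ingestion`, the poll interval, and the timeout are assumptions chosen for illustration. The client is passed in as a parameter so the function can be exercised without AWS access.

```python
import time

# Ingestion job states after which polling should stop.
TERMINAL_STATES = {"COMPLETE", "FAILED", "STOPPED"}


def wait_for_ingestion(client, kb_id, data_source_id, job_id,
                       poll_seconds=10.0, timeout_seconds=1800.0):
    """Poll get_ingestion_job until a terminal state; return the final ingestionJob dict."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=data_source_id,
            ingestionJobId=job_id,
        )["ingestionJob"]
        if job["status"] in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Ingestion job {job_id} still {job['status']} after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)


# Usage (requires boto3 and AWS credentials; the two IDs come from steps 3 and 4):
#   import boto3
#   client = boto3.client("bedrock-agent", region_name="us-east-1")
#   job = wait_for_ingestion(client, "BXJGG7PIPS", data_source_id, job_id)
#   print(job["status"], job.get("statistics"))
```

A job that ends in FAILED is returned rather than raised, so the caller can inspect its statistics before deciding how to react.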
Retrieve and Generate (Python)
from typing import Optional

import boto3

bedrock = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

def query_kb(question: str, num_results: int = 5, session_id: Optional[str] = None) -> dict:
    """
    Query StackFlow Knowledge Base with retrieve-and-generate.
    Uses hybrid search (semantic + keyword) for best recall.
    """
    config = {
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'BXJGG7PIPS',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': num_results,
                    'overrideSearchType': 'HYBRID'
                }
            },
            'generationConfiguration': {
                'promptTemplate': {
                    'textPromptTemplate': '''You are a StackFlow ITSM admin assistant.
Answer based only on the provided context. If you cannot answer from context, say so clearly.
Be specific -- reference exact resource names, CLI commands, and configuration values when available.
Context:
$search_results$
Question: $query$
Answer:'''
                },
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'maxTokens': 1024,
                        'temperature': 0.1,
                        'topP': 0.9
                    }
                }
            }
        }
    }
    request = {
        'input': {'text': question},
        'retrieveAndGenerateConfiguration': config
    }
    # sessionId is a top-level request parameter, not part of the KB configuration.
    # Pass the value returned by a previous call to continue that conversation.
    if session_id:
        request['sessionId'] = session_id
    response = bedrock.retrieve_and_generate(**request)
    # Flatten the per-citation references into one list of sources
    citations = []
    for citation in response.get('citations', []):
        for ref in citation.get('retrievedReferences', []):
            citations.append({
                'content': ref['content']['text'][:200],
                'location': ref['location'].get('s3Location', {}).get('uri', ''),
                # A score may be absent on retrieve-and-generate references; default to 0
                'score': ref.get('score', 0)
            })
    return {
        'answer': response['output']['text'],
        'citations': citations,
        'session_id': response.get('sessionId'),
        'source_count': len(citations)
    }

# Example usage
result = query_kb("How do I resolve high Aurora connection count?")
print(f"Answer: {result['answer']}")
print(f"Sources: {len(result['citations'])} citations")
for c in result['citations']:
    print(f"  - {c['location']}")
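The attributes stored in the .metadata.json sidecars can also narrow retrieval. As a hedged sketch, the helper below builds a filter in the shape accepted by the `filter` key of `vectorSearchConfiguration`; the helper name `build_metadata_filter` is an assumption, and only exact-match (`equals`) clauses are covered.

```python
def build_metadata_filter(**attributes) -> dict:
    """Build a retrieval filter that requires every given attribute to match exactly."""
    clauses = [
        {"equals": {"key": key, "value": value}}
        for key, value in attributes.items()
    ]
    if not clauses:
        raise ValueError("at least one attribute is required")
    if len(clauses) == 1:
        # andAll needs at least two clauses, so return a single clause as-is.
        return clauses[0]
    return {"andAll": clauses}


# Restrict retrieval to Aurora runbooks.
runbook_filter = build_metadata_filter(category="runbook", service="aurora")
vector_search_config = {
    "numberOfResults": 5,
    "overrideSearchType": "HYBRID",
    "filter": runbook_filter,
}
print(vector_search_config)
```

The resulting `vector_search_config` dictionary would take the place of the `vectorSearchConfiguration` value in the request.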
OpenSearch Index Mapping
{
  "mappings": {
    "properties": {
      "id": { "type": "text" },
      "content": { "type": "text", "analyzer": "english" },
      "embedding": {
        "type": "knn_vector",
        "dimension": 1024,
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "nmslib",
          "parameters": { "ef_construction": 512, "m": 16 }
        }
      },
      "metadata": {
        "properties": {
          "category": { "type": "keyword" },
          "service": { "type": "keyword" },
          "severity": { "type": "keyword" },
          "title": { "type": "text" },
          "tags": { "type": "keyword" },
          "lastUpdated": { "type": "date" }
        }
      }
    }
  },
  "settings": {
    "index": { "knn": true, "knn.algo_param.ef_search": 512 }
  }
}
Hybrid Search Strategy
StackFlow uses OpenSearch's hybrid search mode, which combines BM25 keyword scoring with k-NN vector similarity. Each method covers cases the other misses:
- Semantic search catches paraphrased queries: "database won't connect" matches "Aurora connection pool exhausted runbook"
- Keyword search ensures exact technical terms are found: "ECONNREFUSED", "OBO token", specific error codes
- Score normalization: Both scores are normalized to [0,1] and combined with weights: 0.6 * semantic + 0.4 * keyword
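The weighted combination in the last bullet can be illustrated with a small sketch. The source does not say which normalization method is used, so min-max scaling is assumed here; the function names and the sample scores are hypothetical.

```python
def min_max_normalize(scores: dict) -> dict:
    """Scale raw scores to [0, 1]; if all scores are equal, map each to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (score - lo) / (hi - lo) for doc, score in scores.items()}


def fuse_scores(semantic: dict, keyword: dict, w_semantic=0.6, w_keyword=0.4) -> list:
    """Combine normalized scores as w_semantic * semantic + w_keyword * keyword."""
    sem = min_max_normalize(semantic)
    kw = min_max_normalize(keyword)
    fused = {
        # A document missing from one result list contributes 0 for that signal.
        doc: w_semantic * sem.get(doc, 0.0) + w_keyword * kw.get(doc, 0.0)
        for doc in set(sem) | set(kw)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores: cosine-like similarities and unbounded BM25 scores.
semantic_scores = {"aurora-high-connections.md": 0.9,
                   "redis-auth-failure.md": 0.5,
                   "neptune-query-timeout.md": 0.1}
keyword_scores = {"redis-auth-failure.md": 12.0,
                  "neptune-query-timeout.md": 4.0}

ranking = fuse_scores(semantic_scores, keyword_scores)
for doc, score in ranking:
    print(f"{score:.2f}  {doc}")
```

In this example the document with the strongest keyword match overtakes the one with the highest semantic score, because it scores reasonably on both signals.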
import boto3

bedrock_agent = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

def retrieve_only(query: str, num_results: int = 10) -> list:
    """Retrieve passages without generation -- useful for debugging retrieval quality."""
    response = bedrock_agent.retrieve(
        knowledgeBaseId='BXJGG7PIPS',
        retrievalQuery={'text': query},
        retrievalConfiguration={
            'vectorSearchConfiguration': {
                'numberOfResults': num_results,
                'overrideSearchType': 'HYBRID'
            }
        }
    )
    return [{
        'text': r['content']['text'][:300],
        'score': r['score'],
        'uri': r['location'].get('s3Location', {}).get('uri', 'N/A')
    } for r in response['retrievalResults']]

# Test retrieval quality
passages = retrieve_only("Aurora connection pool exhausted Lambda")
for p in passages:
    print(f"Score {p['score']:.3f}: {p['uri']}")
    print(f"  {p['text'][:100]}...")
    print()
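To turn spot checks like the one above into a number, one option is a hit rate over a small set of labeled queries. The sketch below is an assumption about how such a check might look, not part of StackFlow: the function name `hit_rate_at_k`, the queries, and the expected URIs are all hypothetical.

```python
def hit_rate_at_k(retrieved_uris_by_query: dict, expected_uri_by_query: dict, k: int = 5) -> float:
    """Fraction of queries whose expected document appears in the top-k retrieved URIs."""
    if not expected_uri_by_query:
        return 0.0
    hits = 0
    for query, expected_uri in expected_uri_by_query.items():
        top_k = retrieved_uris_by_query.get(query, [])[:k]
        if expected_uri in top_k:
            hits += 1
    return hits / len(expected_uri_by_query)


# Hypothetical labeled queries and retrieval output (best match first).
# With live access, each list could be built as [p['uri'] for p in retrieve_only(query)].
bucket = "s3://stackflow-kb-documents-373544523367"
expected = {
    "Aurora connection pool exhausted": f"{bucket}/runbooks/aurora-high-connections.md",
    "Redis AUTH failed": f"{bucket}/runbooks/redis-auth-failure.md",
}
retrieved = {
    "Aurora connection pool exhausted": [
        f"{bucket}/runbooks/aurora-high-connections.md",
        f"{bucket}/runbooks/lambda-cold-start-remediation.md",
    ],
    "Redis AUTH failed": [
        f"{bucket}/policies/incident-severity-matrix.md",
        f"{bucket}/runbooks/redis-auth-failure.md",
    ],
}

rate_at_1 = hit_rate_at_k(retrieved, expected, k=1)
rate_at_2 = hit_rate_at_k(retrieved, expected, k=2)
print(f"hit rate@1 = {rate_at_1:.2f}, hit rate@2 = {rate_at_2:.2f}")
```

Tracking this value before and after a re-ingestion or a change to the search configuration gives a rough signal of whether retrieval improved or regressed.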