
Search systems collapse when index freshness is coupled to the write path or when teams mistake the search cluster for the source of truth. At scale, you need safe async indexing and reliable rebuild paths. An e-commerce or marketplace platform must serve rich keyword and faceted search over millions of products, keep results fresh despite frequent catalog updates, and preserve a source of truth that search cluster issues cannot corrupt.
TL;DR: Persist canonical catalog data first, publish product change events through EventBridge, queue indexing in SQS, and build OpenSearch as a derived serving layer with Redis caching in front of hot queries.
Why Naive Solutions Break
Using the search index as the primary database leads to inconsistency, painful reindexing, and brittle write paths. Synchronously indexing every catalog update inside product write APIs also turns catalog mutations into latency spikes and operational outages during index contention.
Architecture Overview
Store canonical catalog data in DynamoDB or Aurora depending on the domain, publish change events through EventBridge, buffer indexing jobs in SQS, build denormalized search documents in Lambda or ECS workers, and serve query traffic from OpenSearch behind CloudFront and API Gateway.
Service-by-Service Breakdown
- API Gateway: Public search endpoint with auth, throttling, and request normalization.
- Lambda or ECS: Query-serving layer for ranking, filtering logic, and fallback behavior.
- DynamoDB: Canonical product metadata store for flexible catalog entities and rapid point reads.
- EventBridge: Product lifecycle event bus for ProductUpdated, PriceChanged, and InventoryChanged.
- SQS: Indexing backlog buffer that protects catalog writes from OpenSearch incidents.
- Lambda or ECS workers: Build denormalized search documents and push bulk updates to OpenSearch.
- OpenSearch: Full-text search, faceting, autocomplete, and ranking features.
- ElastiCache Redis: Query result cache for hot search pages and autocomplete prefixes.
- S3: Snapshot storage, offline reindex inputs, and export data.
- CloudWatch and X-Ray: Index lag, query latency, bulk-failure metrics, and tracing.
Request Lifecycle and Data Flow
- Clients send search queries through CloudFront and API Gateway.
- The search API checks Redis for hot cached responses.
- On a miss, the API queries OpenSearch for relevant results and, if needed, enriches them with point reads from DynamoDB.
- Catalog updates write only to the canonical store first.
- Change events flow through EventBridge into SQS.
- Indexing workers build denormalized search documents and bulk-update OpenSearch asynchronously.
- Reindex jobs can be rebuilt from canonical data and S3 snapshots if the index becomes inconsistent.
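The read path above can be sketched as a cache-aside function. This is a minimal illustration, not a production client: the cache and search clients are injected (any objects exposing Redis-style get/setex and an OpenSearch-style search method), and the index name and query shape are assumptions.

```python
import json

def search_products(query, cache, search_client, ttl_seconds=60):
    """Cache-aside read path: check Redis first, fall back to OpenSearch.

    `cache` needs get/setex (e.g. redis.Redis); `search_client` needs a
    search(index=..., body=...) method (e.g. opensearchpy.OpenSearch).
    """
    cache_key = f"search:{query}"
    cached = cache.get(cache_key)
    if cached is not None:
        return json.loads(cached)

    # Miss: query the search cluster, then populate the cache with a TTL
    # so stale results expire on their own.
    response = search_client.search(
        index="catalog-v3",
        body={"query": {"match": {"title": query}}},
    )
    hits = [hit["_source"] for hit in response["hits"]["hits"]]
    cache.setex(cache_key, ttl_seconds, json.dumps(hits))
    return hits
```

A short TTL keeps hot queries cheap while bounding how stale a cached page can get relative to the asynchronously updated index.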
Production Code Patterns
Bulk indexing worker against OpenSearch
import os

from opensearchpy import OpenSearch, helpers

client = OpenSearch(
    hosts=[{"host": os.environ["OS_HOST"], "port": 443}],
    use_ssl=True,
)

def index_batch(documents):
    actions = [
        {"_index": "catalog-v3", "_id": doc["productId"], "_source": doc}
        for doc in documents
    ]
    helpers.bulk(client, actions, chunk_size=500, request_timeout=30)
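The worker feeding index_batch assembles denormalized documents from canonical records. A minimal sketch of that step, with illustrative field names rather than a fixed schema:

```python
def build_search_document(product, price, inventory):
    """Denormalize only search-relevant fields into one flat document.

    Everything else (descriptions, seller details, etc.) stays in the
    canonical store and is fetched by point read at render time.
    Field names here are illustrative assumptions.
    """
    return {
        "productId": product["productId"],
        "title": product["title"],
        "brand": product.get("brand"),
        "category": product.get("category"),
        "price": price["amount"],
        "currency": price["currency"],
        "inStock": inventory["available"] > 0,
    }
```

Keeping the document small is what makes bulk indexing cheap and shards predictable as the catalog grows.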
EventBridge rule for product-change indexing
resource "aws_cloudwatch_event_rule" "catalog_changes" {
  name = "catalog-product-updated"
  event_pattern = jsonencode({
    source        = ["catalog.service"]
    "detail-type" = ["ProductUpdated", "InventoryChanged", "PriceChanged"]
  })
}
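On the publishing side, the catalog service emits events whose source and detail-type match this rule. A small helper that builds such an entry (the payload fields are illustrative; the result is what you would pass to boto3's events client via put_events):

```python
import json

def product_event(detail_type, detail, bus_name="default"):
    """Build a PutEvents entry matching the catalog_changes rule.

    Publish with boto3.client("events").put_events(Entries=[entry]).
    """
    allowed = {"ProductUpdated", "InventoryChanged", "PriceChanged"}
    if detail_type not in allowed:
        raise ValueError(f"unrouted detail-type: {detail_type}")
    return {
        "Source": "catalog.service",
        "DetailType": detail_type,
        "Detail": json.dumps(detail),
        "EventBusName": bus_name,
    }
```

Note the rule only filters events; you still attach an SQS queue as the rule's target so matched events land in the indexing backlog.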
Scaling Strategy
- Scale query-serving nodes separately from indexing workers.
- Bulk index through SQS to smooth write bursts.
- Use OpenSearch shard sizing based on index volume and query concurrency, not default settings.
- Keep large product documents minimal; denormalize only search-relevant fields.
- Cache hot queries and autocomplete aggressively.
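Shard sizing deserves a concrete starting point. A common rule of thumb (an assumption to validate against your own query concurrency, not a hard rule) is to keep each primary shard in the 30-50 GB range:

```python
import math

def suggest_primary_shards(index_size_gb, target_shard_gb=40):
    """Rough primary-shard count: size each shard near a target (30-50 GB
    is common guidance), then load-test under real query concurrency.
    """
    return max(1, math.ceil(index_size_gb / target_shard_gb))
```

So a 200 GB catalog index would start at 5 primaries rather than whatever the cluster default happens to be.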
Cost Optimization Techniques
- Only index fields that matter for ranking or filtering.
- Use DynamoDB point reads for non-search metadata instead of bloating the search document.
- Snapshot and restore from S3 rather than overprovisioning for rare rebuilds.
- Expire low-value query caches quickly and measure hit rate versus memory cost.
Security Best Practices
- Keep OpenSearch in a VPC and front it with an application layer rather than direct public access.
- Restrict indexing and query roles separately.
- Encrypt data at rest and in transit.
- Audit admin reindex and mapping-change operations carefully.
Failure Handling and Resilience
- Never make OpenSearch the only copy of product data.
- Let SQS absorb indexing outages while catalog writes continue.
- Use bulk retry with backoff and failed-document quarantine.
- Degrade gracefully to cached results or top-sellers if the search cluster is impaired.
- Snapshot indexes regularly to S3.
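The retry-with-quarantine step can be sketched as a small wrapper around the bulk send. This is a simplified illustration: `send_batch` stands in for a real bulk call that raises on failure, and `quarantine` for wherever failed documents are parked (a DLQ or an S3 prefix, for example).

```python
import time

def bulk_with_retry(send_batch, documents, quarantine,
                    max_attempts=3, base_delay=1.0):
    """Retry a bulk send with exponential backoff; after max_attempts,
    quarantine the batch instead of blocking the indexing pipeline.
    Returns True on success, False if the batch was quarantined.
    """
    for attempt in range(max_attempts):
        try:
            send_batch(documents)
            return True
        except Exception:
            if attempt + 1 == max_attempts:
                # Park the batch for offline inspection and replay.
                quarantine.extend(documents)
                return False
            # Exponential backoff: base, 2x base, 4x base, ...
            time.sleep(base_delay * (2 ** attempt))
```

Because the canonical store is untouched, quarantined batches can always be rebuilt and replayed later.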
Trade-offs and Alternatives
OpenSearch is powerful for text and facets, but it adds operational tuning around shards, mappings, and reindexing. If search needs are simple, DynamoDB plus prefix or exact-match indexes may be enough. At scale, dedicated search infrastructure becomes worthwhile.
Real-World Use Case
An Amazon-style product catalog with fast-moving pricing, inventory, and seller attributes fits this architecture well.
Key Interview Insights
- Stress that search is a derived view, not the source of truth.
- Explain asynchronous indexing and freshness trade-offs.
- Mention shard sizing, bulk ingestion, and denormalized documents.
- Discuss graceful degradation when search fails but commerce must continue.
Recommended Reading
→ Designing Data-Intensive Applications — The essential book for understanding distributed systems, databases, and the infrastructure behind architectures like these.
→ System Design Interview Vol. 2 — Covers many of the architectures in this post in interview format with trade-off analysis.
