Designing a Multi-Level Cache Architecture
Design a hierarchical caching system — L1 in-process cache, L2 distributed cache, L3 CDN — with coherence strategies and eviction policies.
A hierarchical cache reduces latency and database load by placing caches at multiple levels.
Cache Hierarchy
Request
↓
L1: In-Process Cache (memory, <1ms)
↓ miss
L2: Distributed Cache (Redis, 1-5ms)
↓ miss
L3: CDN Edge Cache (5-30ms)
↓ miss
Origin Database (10-100ms)
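The fall-through behavior above can be sketched as a chain of lookups. This is a minimal sketch: the `l1`, `redis_client`, and `db` objects are hypothetical stand-ins, and the CDN tier is omitted because it sits in front of the application rather than inside it.

```python
import json

def get(key, l1, redis_client, db):
    """Walk the cache hierarchy, falling through on each miss and
    populating the faster levels on the way back up."""
    # L1: in-process, no network hop
    if key in l1:
        return l1[key]
    # L2: shared distributed cache (Redis)
    cached = redis_client.get(key)
    if cached is not None:
        value = json.loads(cached)
        l1[key] = value  # populate L1 on the way back
        return value
    # Origin: authoritative but slowest
    value = db.fetch(key)
    redis_client.setex(key, 3600, json.dumps(value))
    l1[key] = value
    return value
```

Populating the upper levels on each miss is what drives the hit rate up over time: the second request for the same key never leaves the process.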
L1: In-Process Cache
In-memory cache within each application instance.
# Caffeine (Java) supports size + TTL natively; in Python,
# functools.lru_cache has no TTL, so use cachetools instead
from cachetools import TTLCache, cached

@cached(TTLCache(maxsize=1000, ttl=60))  # 1000 entries, 60s TTL
def get_user(user_id):
    return redis.get(f'user:{user_id}') or db.query(user_id)
Pros: fastest (<1ms), no network hop
Cons: not shared across instances, limited size, consistency risk
L2: Distributed Cache (Redis)
Shared across all application instances.
def get_product(product_id):
    key = f'product:{product_id}'
    cached = redis.get(key)
    if cached:
        return json.loads(cached)
    data = db.query('SELECT * FROM products WHERE id = ?', product_id)
    redis.setex(key, 3600, json.dumps(data))  # 1 hour TTL
    return data
L3: CDN Cache
Geographically distributed, serves static + semi-static responses.
Cache-Control: public, max-age=300, stale-while-revalidate=60
Vary: Accept-Language
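From an application, those headers might be built like this (a sketch returning a plain dict of response headers; the framework wiring is omitted):

```python
def cdn_cache_headers(max_age=300, swr=60):
    """Headers that let the CDN cache a response for max_age seconds
    and keep serving the stale copy for swr seconds while it
    revalidates against the origin in the background."""
    return {
        "Cache-Control": f"public, max-age={max_age}, stale-while-revalidate={swr}",
        "Vary": "Accept-Language",  # cache a separate copy per language
    }
```

`stale-while-revalidate` is what hides origin latency from users: the edge answers instantly from the stale copy while refreshing asynchronously.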
Cache Coherence Problem
Update product price in DB:
1. Update DB: price = $49.99
2. Must invalidate: CDN, Redis, all L1 caches
Strategies:
a. TTL-based expiry (simple, eventual consistency)
b. Event-driven invalidation via Kafka
c. Cache-aside with write-through
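Strategy (c) can be sketched as a write path that updates the database and the caches in one step. The `db`, `redis_client`, and `l1` objects here are hypothetical stand-ins:

```python
import json

def update_product(product_id, data, db, redis_client, l1):
    """Write-through: persist to the DB first, then immediately refresh
    L2 and drop the local L1 entry so no level keeps serving the old
    value longer than its own TTL."""
    db.execute('UPDATE products SET data = ? WHERE id = ?',
               json.dumps(data), product_id)
    key = f'product:{product_id}'
    redis_client.setex(key, 3600, json.dumps(data))  # refresh L2
    l1.pop(key, None)                                # invalidate local L1
```

Note this only clears the L1 cache of the instance that handled the write; the other instances' L1 entries still age out via TTL or an invalidation event, which is why L1 TTLs are kept short.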
Event-Driven Invalidation
DB write
↓
Change Data Capture (Debezium)
↓
Kafka: product_updated event
↓
Cache Invalidation Service
→ DELETE redis:product:123
→ Purge CDN: POST /purge?url=/products/123
→ L1 entries age out via their short TTL (30s max)
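The invalidation service in this flow can be sketched as a consumer loop over the Kafka topic. The consumer is assumed to yield messages with a JSON-encoded `.value` (as a `kafka-python`-style client does), and `redis_client` and `purge_cdn` are hypothetical helpers:

```python
import json

def handle_event(event, redis_client, purge_cdn):
    """Process one product_updated event: drop the Redis entry and
    purge the CDN path. L1 caches are left to expire on their own
    short TTL."""
    product_id = event['product_id']
    redis_client.delete(f'product:{product_id}')
    purge_cdn(f'/products/{product_id}')

def run(consumer, redis_client, purge_cdn):
    # consumer yields messages whose .value is a JSON-encoded event
    for message in consumer:
        handle_event(json.loads(message.value), redis_client, purge_cdn)
```

Keeping the invalidation logic in its own service means every cache tier is cleared from one place, instead of each application instance duplicating purge calls.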
Hit Rate Optimization
Target: >95% L1 hit rate for hot data
Principles:
- Cache only hot data (LRU eviction)
- Warm up caches on deployment
- Pre-populate on anticipated spikes
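Warm-up on deployment can be sketched as pre-loading the top-N hottest keys before the new instance takes traffic. The `db`, `redis_client`, and `l1` objects, and the `hits` ordering column, are hypothetical for this sketch:

```python
import json

def warm_cache(db, redis_client, l1, top_n=1000):
    """Pre-populate L2 and L1 with the hottest products so the first
    requests after a deploy do not all miss through to the database."""
    rows = db.query(
        'SELECT id, data FROM products ORDER BY hits DESC LIMIT ?', top_n)
    for product_id, data in rows:
        key = f'product:{product_id}'
        redis_client.setex(key, 3600, data)  # data is already JSON-encoded
        l1[key] = json.loads(data)
```

Running this in the deployment hook, before the instance is added to the load balancer, avoids the thundering-herd of cold misses that otherwise follows every rollout.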
Conclusion
Multi-level caches can reduce p99 latency by 10-100x: L1 for hot local data, L2 for state shared across instances, L3/CDN for geographic distribution.