Advanced Caching Strategies & Cache Invalidation
Master multi-level caching, cache warming, invalidation patterns, and monitoring cache health.
Caching is one of the most effective ways to reduce latency and database load. Cache invalidation, however, is notoriously difficult, as the quip commonly attributed to Phil Karlton has it: "There are only two hard things in Computer Science: cache invalidation and naming things."
Multi-Level Caching
Level 1: In-Process Cache
Data is stored in the application's own memory (e.g., a hash map).
Pros:
- Nanosecond access times.
- No network overhead.
Cons:
- Lost on application restart.
- Not shared across instances (with 10 app servers you hold 10 independent copies, which can drift apart).
- Limited by application RAM.
Use Case: Frequently accessed, rarely-changing data (config, feature flags).
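As a rough illustration of a level 1 cache, the sketch below wraps a plain dict with a per-key expiry time. The class name `InProcessCache` and its `clock` parameter are hypothetical, introduced only so that expiry can be demonstrated without sleeping; the sketch is not thread-safe and has no size limit.

```python
import time
from typing import Any, Callable, Optional


class InProcessCache:
    """Tiny per-process cache: a dict plus an expiry timestamp per key."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # Injectable so tests can control time
        self._entries: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl: float) -> None:
        self._entries[key] = (self._clock() + ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # Lazily drop expired entries on read
            return None
        return value


flags = InProcessCache()
flags.set("feature:dark_mode", True, ttl=60)
```

Because the data lives in the process, a restart empties it, which is why this level suits values that are cheap to reload.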
Level 2: Distributed Cache (Redis)
Data stored in a central in-memory datastore accessible by all app instances.
Pros:
- Sub-millisecond to low-millisecond access times on a local network.
- Shared across all instances.
- Survives application restarts.
Cons:
- Every lookup pays a network round trip.
- Requires infrastructure.
Use Case: Session data, computed results, rate limit counters.
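Levels 1 and 2 are often combined into a single lookup path. The sketch below shows one way to do that, assuming a shared cache that exposes only `get` and `set`; `DictStore` is a hypothetical in-memory stand-in for a Redis-like client, and `tiered_get` and `load` are illustrative names.

```python
from typing import Any, Callable, Optional


class DictStore:
    """Hypothetical stand-in for a Redis-like client; only get/set are modelled."""

    def __init__(self) -> None:
        self.data: dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self.data.get(key)

    def set(self, key: str, value: Any) -> None:
        self.data[key] = value


def tiered_get(key: str, local: dict, shared: DictStore, load: Callable[[str], Any]) -> Any:
    """Check the in-process dict, then the shared cache, then the source of truth."""
    if key in local:
        return local[key]
    value = shared.get(key)
    if value is None:
        value = load(key)        # e.g. a database query
        shared.set(key, value)
    local[key] = value           # Promote to level 1 for the next read
    return value
```

A second app instance with an empty local dict would find the value in the shared store and skip the database entirely. A real deployment would also put TTLs on both levels so the local copy cannot stay stale forever.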
Level 3: HTTP Cache (CDN)
Edge servers near users cache responses.
Pros:
- Low latency for end users, often a few milliseconds, thanks to geographic proximity.
- Reduces origin load.
Cons:
- Mostly limited to safe, cacheable requests (typically GET and HEAD).
- Cache control is complex (headers, TTLs).
Use Case: Public assets (images, CSS, API responses for read-heavy endpoints).
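CDN behaviour is driven largely by the `Cache-Control` response header. As a small sketch, the helper below assembles a header value from the standard `public`/`private`, `max-age`, and `s-maxage` directives; the function name `cache_control` and its parameters are hypothetical.

```python
from typing import Optional


def cache_control(max_age: int, s_maxage: Optional[int] = None, public: bool = True) -> str:
    """Build a Cache-Control header value from standard HTTP directives."""
    parts = ["public" if public else "private", f"max-age={max_age}"]
    if s_maxage is not None:
        # s-maxage applies only to shared caches such as CDNs
        parts.append(f"s-maxage={s_maxage}")
    return ", ".join(parts)


# Browsers may reuse the response for 60 s, shared caches for an hour.
headers = {"Cache-Control": cache_control(max_age=60, s_maxage=3600)}
```

Splitting the two lifetimes lets the edge absorb most traffic while browsers still re-check fairly often.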
Cache Invalidation Strategies
1. TTL (Time-To-Live)
Each cache entry expires after a set duration.
cache.set("user:123", user_data, ttl=300) # 5 minutes
Pros:
- Simple.
- Eventually consistent.
Cons:
- Data can be stale for up to the full TTL.
- An update made just after an entry is cached stays invisible to readers until that entry expires.
When to Use: Loosely consistent data (recommendations, rankings).
2. Event-Based Invalidation
Explicitly invalidate cache when data changes.
# When user updates profile
cache.delete("user:123")
# or
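cache.delete("user:*") # Invalidate all user keys (expensive; assumes the cache client supports patterns, which Redis DEL does not, so a SCAN-based helper would be needed there)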
Pros:
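- Near-immediate consistency: readers see the new data on the first read after the write.
- No waiting for a TTL to run out.
Cons:
- Every code path that writes the data must also invalidate the cache.
- A forgotten invalidation leaves stale data in the cache indefinitely, unless a TTL backs it up.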
When to Use: Critical data (account balance, auth tokens).
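One way to avoid forgetting an invalidation is to route every write through a single place that also clears the cache. The sketch below assumes plain dicts as stand-ins for both the database and the cache; `UserRepository` is a hypothetical name.

```python
from typing import Any, Optional


class UserRepository:
    """Hypothetical write path that invalidates the cache after each update."""

    def __init__(self, db: dict, cache: dict) -> None:
        self.db = db        # Stand-in for the database
        self.cache = cache  # Stand-in for the cache client

    def get(self, user_id: int) -> Optional[Any]:
        key = f"user:{user_id}"
        if key in self.cache:
            return self.cache[key]
        user = self.db.get(user_id)
        if user is not None:
            self.cache[key] = user
        return user

    def update(self, user_id: int, fields: dict) -> None:
        # Write the source of truth first...
        self.db[user_id] = {**self.db.get(user_id, {}), **fields}
        # ...then drop the cached copy so the next read reloads it
        self.cache.pop(f"user:{user_id}", None)
```

Deleting rather than overwriting the cached value keeps the write path simple: the next reader repopulates the entry from the database.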
3. Tag-Based Invalidation
Group related cache entries with tags.
cache.set("user:123", user_data, tags=["user:123", "users"])
cache.set("user:123:posts", posts, tags=["user:123", "posts"])
# Invalidate all related entries
cache.invalidate_by_tag("user:123") # Invalidates both
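The `tags=` argument and `invalidate_by_tag` above are not a built-in feature of every cache client. As a minimal in-memory sketch of how such an API could work, the hypothetical `TaggedCache` below keeps a reverse index from each tag to the keys stored under it.

```python
from collections import defaultdict
from typing import Any, Iterable, Optional


class TaggedCache:
    """In-memory sketch: values by key, plus a reverse index from tag to keys."""

    def __init__(self) -> None:
        self._values: dict[str, Any] = {}
        self._keys_by_tag: dict[str, set[str]] = defaultdict(set)

    def set(self, key: str, value: Any, tags: Iterable[str] = ()) -> None:
        self._values[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key: str) -> Optional[Any]:
        return self._values.get(key)

    def invalidate_by_tag(self, tag: str) -> int:
        """Delete every key recorded under the tag; return how many were removed."""
        removed = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if key in self._values:
                del self._values[key]
                removed += 1
        return removed


cache = TaggedCache()
cache.set("user:123", {"name": "Ada"}, tags=["user:123", "users"])
cache.set("user:123:posts", ["first post"], tags=["user:123", "posts"])
removed = cache.invalidate_by_tag("user:123")
```

The cost of this design is the extra bookkeeping: the index has to be updated on every write, and in a distributed cache it must live somewhere all instances can reach.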
4. Versioned Keys
Include a version in the cache key; change version to invalidate.
# Version 1
cache.set("user:123:v1", user_data)
# Data changed, increment version
cache.set("user:123:v2", new_user_data)
# The v1 entry is now orphaned: nothing reads it, but it occupies memory until it is evicted or its TTL expires
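In practice the current version has to be stored somewhere so that readers and writers agree on the key. The sketch below keeps a per-entity counter next to the data; `VersionedCache` and `bump` are hypothetical names, and both dicts are stand-ins for a real cache.

```python
from typing import Any, Optional


class VersionedCache:
    """Sketch: a per-entity version counter is part of every cache key."""

    def __init__(self) -> None:
        self._store: dict[str, Any] = {}      # Stand-in for the cache
        self._versions: dict[str, int] = {}   # Current version per entity

    def _key(self, entity: str) -> str:
        return f"{entity}:v{self._versions.get(entity, 1)}"

    def get(self, entity: str) -> Optional[Any]:
        return self._store.get(self._key(entity))

    def set(self, entity: str, value: Any) -> None:
        self._store[self._key(entity)] = value

    def bump(self, entity: str) -> None:
        """Invalidate by moving readers to a new key; the old key is left behind."""
        self._versions[entity] = self._versions.get(entity, 1) + 1


cache = VersionedCache()
cache.set("user:123", {"name": "Ada"})
cache.bump("user:123")             # Data changed
missing = cache.get("user:123")    # Readers now look under user:123:v2
```

Nothing is deleted on a bump, so the old entries should carry a TTL or rely on eviction to be cleaned up.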
Cache Stampede (Thundering Herd)
Problem: When a cached item expires, many concurrent requests miss the cache and all hit the database simultaneously.
Cache expires for "popular-item"
↓
100 concurrent requests → all miss cache
↓
All 100 hit database with the same query
↓
Database overload / slow response
Solutions
1. Locking (Mutex)
Only one request recomputes; others wait for the result.
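    def get_data(key):
        value = cache.get(key)
        if value is not None:  # Compare with None so falsy values (0, "", []) still count as hits
            return value
        # `lock` is assumed to be a context manager that provides a per-key mutex
        with lock(f"lock:{key}"):
            # Double-check: another thread may have computed it while we waited
            value = cache.get(key)
            if value is not None:
                return value
            value = expensive_computation()
            cache.set(key, value)
            return value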
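To see the double-checked lock at work inside one process, the sketch below runs 20 threads against the same key and counts how often the expensive function executes. It uses `threading.Lock` from the standard library; the dict-based cache, the per-key lock table, and the simulated delay are all assumptions for illustration. Coordinating across several app instances would need a distributed lock instead.

```python
import threading
import time

cache: dict[str, str] = {}
locks: dict[str, threading.Lock] = {}
locks_guard = threading.Lock()
computations = 0


def expensive_computation(key: str) -> str:
    global computations
    computations += 1    # Only ever runs while the per-key lock is held
    time.sleep(0.05)     # Simulate a slow query
    return f"value-for-{key}"


def get_data(key: str) -> str:
    value = cache.get(key)
    if value is not None:
        return value
    with locks_guard:    # Create the per-key lock exactly once
        lock = locks.setdefault(key, threading.Lock())
    with lock:
        value = cache.get(key)   # Double-check after acquiring the lock
        if value is None:
            value = expensive_computation(key)
            cache[key] = value
        return value


threads = [threading.Thread(target=get_data, args=("popular-item",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(computations)  # prints 1: one thread computed, the other 19 reused its result
```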
2. Stale-While-Revalidate
Serve stale data while recomputing in the background.
    def get_with_revalidate(key):  # Illustrative wrapper name
        value = cache.get(key)
        if value is None:
            value = expensive_computation()  # Cache miss, compute fresh
            cache.set(key, value)  # Store it so the next request hits
        elif value.is_stale:
            queue_for_refresh(key)  # Background task recomputes
        return value  # Fresh or stale, respond immediately
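A fuller sketch of the same idea is shown below, assuming staleness is tracked with a "soft" TTL stored beside each value. `SwrCache`, `Entry`, and the list used as a refresh queue are hypothetical stand-ins; a real system would hand the refresh to a background worker or task queue.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Entry:
    value: Any
    fresh_until: float   # After this moment the entry is stale but still servable


class SwrCache:
    def __init__(self, soft_ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.soft_ttl = soft_ttl
        self.clock = clock
        self.entries: dict[str, Entry] = {}
        self.refresh_queue: list[str] = []   # Stand-in for a background job queue

    def put(self, key: str, value: Any) -> None:
        self.entries[key] = Entry(value, self.clock() + self.soft_ttl)

    def get(self, key: str, compute: Callable[[str], Any]) -> Any:
        entry = self.entries.get(key)
        if entry is None:
            value = compute(key)   # True miss: the caller has to wait
            self.put(key, value)
            return value
        if self.clock() >= entry.fresh_until and key not in self.refresh_queue:
            self.refresh_queue.append(key)   # Schedule one refresh per stale key
        return entry.value         # Fresh or stale, answer immediately

    def run_refreshes(self, compute: Callable[[str], Any]) -> None:
        """What a background worker would do: recompute queued keys."""
        while self.refresh_queue:
            key = self.refresh_queue.pop(0)
            self.put(key, compute(key))
```

Only the very first request for a key pays the computation cost; later requests get an immediate answer even while a refresh is pending.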
3. Probabilistic Early Expiration
Recompute items before they expire, with a probability that rises as expiry approaches, so that refreshes are spread over time instead of all firing at the expiry instant.
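    import random
    import time

    def should_recompute(expires_at, total_ttl):
        time_until_expiry = expires_at - time.time()
        # Probability grows linearly from 0 (just cached) to 1 (at expiry)
        probability = 1 - (time_until_expiry / total_ttl)
        return random.random() < probability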
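To make the linear ramp concrete, the sketch below evaluates the same formula at two points in an entry's life. The `now` and `rand` parameters are hypothetical additions that let the clock and the random draw be pinned, so the outcome is reproducible.

```python
import random
import time
from typing import Callable


def should_recompute(
    expires_at: float,
    total_ttl: float,
    now: Callable[[], float] = time.time,
    rand: Callable[[], float] = random.random,
) -> bool:
    """Linear ramp: probability 0 right after caching, 1 at the expiry time."""
    time_until_expiry = expires_at - now()
    probability = 1 - (time_until_expiry / total_ttl)
    return rand() < probability


def fixed_draw() -> float:
    return 0.6   # Pin the random draw for the demonstration


# An entry cached at t=0 with a 100-second TTL expires at t=100.
early = should_recompute(100, 100, now=lambda: 50, rand=fixed_draw)  # probability 0.5
late = should_recompute(100, 100, now=lambda: 90, rand=fixed_draw)   # probability 0.9
```

With the draw fixed at 0.6, the request at t=50 keeps the cached value and the request at t=90 triggers a recompute. With real random draws, roughly half of the requests arriving at t=50 would recompute.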
Cache Warming
Pre-populate the cache before the application starts handling requests.
    def warm_cache():
        popular_users = db.query("SELECT * FROM users ORDER BY followers DESC LIMIT 1000")
        for user in popular_users:
            cache.set(f"user:{user.id}", user, ttl=3600)

    # Run on application startup
    warm_cache()
Benefit: Avoid cold-start latency spikes.
Monitoring Cache Health
Track:
| Metric | Formula | Target |
|---|---|---|
| Hit Rate | Hits / (Hits + Misses) | > 80% |
| Eviction Rate | Evictions / Time | Should be ~0 |
| Memory Usage | Used / Max | < 80% |
| P99 Latency | 99th percentile response time | < 10 ms |
Alerts:
- Hit rate drops below 70% (cache not effective)
- Eviction rate too high (cache too small)
- Memory usage near max (will start evicting)
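The formulas and alert conditions above can be sketched as a small check over raw counters. `CacheStats` and `alerts` are hypothetical names, and the eviction and memory thresholds are assumed defaults, since the text only says "too high" and "near max".

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    hits: int
    misses: int
    evictions: int
    used_bytes: int
    max_bytes: int
    window_seconds: float

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    @property
    def eviction_rate(self) -> float:
        return self.evictions / self.window_seconds

    @property
    def memory_usage(self) -> float:
        return self.used_bytes / self.max_bytes


def alerts(
    stats: CacheStats,
    max_evictions_per_sec: float = 1.0,   # Assumed threshold for "too high"
    memory_alert_ratio: float = 0.90,     # Assumed threshold for "near max"
) -> list[str]:
    found = []
    if stats.hit_rate < 0.70:
        found.append("hit rate below 70%")
    if stats.eviction_rate > max_evictions_per_sec:
        found.append("eviction rate too high")
    if stats.memory_usage > memory_alert_ratio:
        found.append("memory usage near max")
    return found
```

Counters like these are typically sampled over a fixed window and fed into whatever alerting system is already in use.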