Naveen.dev
Chapter 02
6 min read · 2026-06-07

Advanced Caching Strategies & Cache Invalidation

Master multi-level caching, cache warming, invalidation patterns, and monitoring cache health.

Caching is one of the most effective ways to reduce latency and database load. Cache invalidation, however, is notoriously difficult. As the saying attributed to Phil Karlton goes: "There are only two hard things in Computer Science: cache invalidation and naming things."


Multi-Level Caching

Level 1: In-Process Cache

Data stored in application memory (e.g., hashmap).

Pros:

  • Nanosecond access times.
  • No network overhead.

Cons:

  • Lost on application restart.
  • Not shared across instances (with 10 app servers you hold 10 copies of the data, and they can drift apart).
  • Limited by application RAM.

Use Case: Frequently accessed, rarely-changing data (config, feature flags).

Level 2: Distributed Cache (Redis)

Data stored in a central in-memory datastore accessible by all app instances.

Pros:

  • Sub-millisecond to low-millisecond access times on a local network.
  • Shared across all instances.
  • Survives application restarts.

Cons:

  • Network latency.
  • Requires infrastructure.

Use Case: Session data, computed results, rate limit counters.
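
The first two levels are often combined into a single read-through lookup: check process memory, then the shared cache, then the source of truth. The following is a minimal sketch of that idea, not a production design. It assumes plain dicts stand in for both caches (a real Level 2 would be a Redis client), and `TwoLevelCache` and `load_from_db` are hypothetical names introduced only for illustration.

```python
from typing import Any, Callable, Dict


class TwoLevelCache:
    """Read-through lookup: L1 (in-process) -> L2 (shared) -> source of truth."""

    def __init__(self, l2_store: Dict[str, Any], loader: Callable[[str], Any]):
        self.l1: Dict[str, Any] = {}   # per-process memory, one copy per instance
        self.l2 = l2_store             # stand-in for a shared store such as Redis
        self.loader = loader           # hypothetical fallback, e.g. a database query

    def get(self, key: str) -> Any:
        if key in self.l1:
            return self.l1[key]
        if key in self.l2:
            value = self.l2[key]
            self.l1[key] = value       # promote to L1 for the next read
            return value
        value = self.loader(key)       # missed both levels
        self.l2[key] = value
        self.l1[key] = value
        return value


shared: Dict[str, Any] = {}
calls = []

def load_from_db(key):
    calls.append(key)                  # record how often the "database" is hit
    return f"row-for-{key}"

app_a = TwoLevelCache(shared, load_from_db)
app_b = TwoLevelCache(shared, load_from_db)
app_a.get("user:123")   # misses both levels and calls the loader
app_b.get("user:123")   # misses its own L1 but is served from the shared L2
```

Note that the L1 entries in this sketch never expire, so a real version would need a TTL or an invalidation path for them as well.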

Level 3: HTTP Cache (CDN)

Edge servers near users cache responses.

Pros:

  • Millisecond latency (geographic proximity).
  • Reduces origin load.

Cons:

  • Mostly limited to safe, cacheable requests (typically GET and HEAD).
  • Cache control is complex (headers, TTLs, purging).

Use Case: Public assets (images, CSS, API responses for read-heavy endpoints).
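
Most of the header complexity comes down to the `Cache-Control` response header. As a rough illustration, the helper below builds one from a few common directives. The function name `cache_control` and the chosen values are assumptions for this sketch, and CDN support for `stale-while-revalidate` varies, so check your provider before relying on it.

```python
from typing import Optional


def cache_control(max_age: int,
                  s_maxage: Optional[int] = None,
                  public: bool = True,
                  stale_while_revalidate: Optional[int] = None) -> str:
    """Build a Cache-Control header value from a few common directives."""
    parts = ["public" if public else "private", f"max-age={max_age}"]
    if s_maxage is not None:
        # s-maxage overrides max-age for shared caches such as CDNs
        parts.append(f"s-maxage={s_maxage}")
    if stale_while_revalidate is not None:
        parts.append(f"stale-while-revalidate={stale_while_revalidate}")
    return ", ".join(parts)


# Browsers keep the response for 1 minute, the CDN for 1 hour
headers = {"Cache-Control": cache_control(60, s_maxage=3600)}
```

Splitting the browser TTL (`max-age`) from the edge TTL (`s-maxage`) lets the CDN absorb most traffic while clients still pick up changes fairly quickly.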


Cache Invalidation Strategies

1. TTL (Time-To-Live)

Cache expires after a set duration.

cache.set("user:123", user_data, ttl=300)  # 5 minutes

Pros:

  • Simple.
  • Eventually consistent.

Cons:

  • Stale data for up to TTL duration.
  • Worst case: an update that lands just after a value is cached stays invisible for the full TTL.

When to Use: Loosely consistent data (recommendations, rankings).
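
To make the expiry behaviour concrete, here is a small in-memory TTL cache. It is a sketch under stated assumptions: the `TTLCache` class and its injectable `clock` are hypothetical, introduced so the example can advance time without sleeping, and expiry is checked lazily on read rather than by a background sweeper.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._data: Dict[str, Tuple[Any, float]] = {}   # key -> (value, expires_at)

    def set(self, key: str, value: Any, ttl: float) -> None:
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key: str) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]        # lazy expiry: drop the entry on read
            return None
        return value


now = [0.0]                            # fake clock so the example needs no sleep
cache = TTLCache(clock=lambda: now[0])
cache.set("user:123", {"name": "Ada"}, ttl=300)
fresh = cache.get("user:123")          # within the 5 minute window
now[0] = 301.0
expired = cache.get("user:123")        # past the TTL, treated as a miss
```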

2. Event-Based Invalidation

Explicitly invalidate cache when data changes.

# When user updates profile
cache.delete("user:123")
# or
cache.delete("user:*")  # Invalidate all user keys (expensive: many stores need a key scan for pattern deletes)

Pros:

  • Near-immediate consistency: readers see the new value on the next cache miss.
  • No stale window tied to a TTL.

Cons:

  • Must remember to invalidate everywhere.
  • Risk of cache inconsistency if invalidation is forgotten.

When to Use: Critical data (account balance, auth tokens).
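
One common way to keep invalidation from being forgotten is to put it in the write path itself, so every update goes through a single function. The sketch below assumes dicts stand in for the database and the cache, and `get_user` / `update_user` are hypothetical helper names.

```python
from typing import Any, Dict

db: Dict[str, Any] = {"user:123": {"name": "Ada"}}   # stand-in for the database
cache: Dict[str, Any] = {}                            # stand-in for the cache


def get_user(key: str) -> Any:
    if key in cache:
        return cache[key]
    value = db[key]
    cache[key] = value                 # populate on miss
    return value


def update_user(key: str, new_value: Any) -> None:
    db[key] = new_value                # 1. write the source of truth first
    cache.pop(key, None)               # 2. then drop the cached copy


before = get_user("user:123")
update_user("user:123", {"name": "Grace"})
after = get_user("user:123")           # miss, reloaded from the database
```

Even with this ordering, a concurrent reader can repopulate the cache with the old value between the two steps, so pairing event-based invalidation with a modest TTL is a reasonable safety net.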

3. Tag-Based Invalidation

Group related cache entries with tags.

cache.set("user:123", user_data, tags=["user:123", "users"])
cache.set("user:123:posts", posts, tags=["user:123", "posts"])

# Invalidate all related entries
cache.invalidate_by_tag("user:123")  # Invalidates both
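
The `tags=` and `invalidate_by_tag` calls above are not part of every cache client, so they often have to be built on top. One possible approach, sketched here with a hypothetical in-memory `TaggedCache`, is to keep a reverse index from each tag to the keys that carry it.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Set


class TaggedCache:
    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self._keys_by_tag: Dict[str, Set[str]] = defaultdict(set)   # tag -> keys

    def set(self, key: str, value: Any, tags: Iterable[str] = ()) -> None:
        self._data[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key: str) -> Any:
        return self._data.get(key)

    def invalidate_by_tag(self, tag: str) -> int:
        # Removed keys may still be listed under their other tags;
        # those leftover index entries are harmless in this sketch.
        keys = self._keys_by_tag.pop(tag, set())
        for key in keys:
            self._data.pop(key, None)
        return len(keys)


cache = TaggedCache()
cache.set("user:123", {"name": "Ada"}, tags=["user:123", "users"])
cache.set("user:123:posts", ["hello"], tags=["user:123", "posts"])
cache.set("user:456", {"name": "Grace"}, tags=["user:456", "users"])
removed = cache.invalidate_by_tag("user:123")   # drops both user:123 entries
```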

4. Versioned Keys

Include a version in the cache key; change version to invalidate.

# Version 1
cache.set("user:123:v1", user_data)

# Data changed, increment version
cache.set("user:123:v2", new_user_data)

# Old version is orphaned: nothing reads it anymore, but it occupies memory until TTL or eviction removes it
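
For versioned keys to work, readers need to know the current version. A minimal sketch, assuming the version number is kept in a small lookup (the `versions` dict here is hypothetical and could equally live in the cache or the database):

```python
from typing import Any, Dict

cache: Dict[str, Any] = {}
versions: Dict[str, int] = {}          # hypothetical version store, entity -> version


def versioned_key(entity: str) -> str:
    return f"{entity}:v{versions.get(entity, 1)}"


def read(entity: str) -> Any:
    return cache.get(versioned_key(entity))


def write(entity: str, value: Any) -> None:
    cache[versioned_key(entity)] = value


def bump(entity: str) -> None:
    # Invalidate by moving readers to a new key; nothing is deleted
    versions[entity] = versions.get(entity, 1) + 1


write("user:123", {"name": "Ada"})     # stored under user:123:v1
bump("user:123")                       # data changed
miss = read("user:123")                # looks up user:123:v2, which is empty
write("user:123", {"name": "Grace"})   # stored under user:123:v2
```

The trade-off is an extra lookup for the version on every read, in exchange for invalidation that is a single cheap increment.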

Cache Stampede (Thundering Herd)

Problem: When a cached item expires, many concurrent requests miss the cache and all hit the database simultaneously.

Cache expires for "popular-item"
    ↓
100 concurrent requests → all miss cache
    ↓
All 100 hit database with the same query
    ↓
Database overload / slow response

Solutions

1. Locking (Mutex)

Only one request recomputes; others wait for the result.

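def get_data(key):
    value = cache.get(key)
    if value is not None:  # compare to None so cached falsy values (0, "", []) still count as hits
        return value

    # `lock` is a placeholder for a per-key mutex (in-process or distributed)
    with lock(f"lock:{key}"):
        # Double-check: another thread may have computed it while we waited
        value = cache.get(key)
        if value is not None:
            return value

        value = expensive_computation()
        cache.set(key, value)
        return value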
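
To see the effect of the pattern above, here is a self-contained version using Python threads. It is only a sketch: the dict cache, the `lock_for` helper, and the 50 ms sleep are assumptions for the demo, and the locks live in one process, so protecting several app instances would need a distributed lock instead.

```python
import threading
import time
from typing import Any, Dict

cache: Dict[str, Any] = {}
locks: Dict[str, threading.Lock] = {}
locks_guard = threading.Lock()         # protects the locks dict itself
compute_calls = 0


def lock_for(key: str) -> threading.Lock:
    with locks_guard:
        return locks.setdefault(key, threading.Lock())


def expensive_computation(key: str) -> str:
    global compute_calls
    compute_calls += 1                 # only ever called while holding the key's lock
    time.sleep(0.05)                   # simulate a slow query
    return f"value-for-{key}"


def get_data(key: str) -> Any:
    value = cache.get(key)
    if value is not None:
        return value
    with lock_for(key):
        value = cache.get(key)         # double-check after acquiring the lock
        if value is not None:
            return value
        value = expensive_computation(key)
        cache[key] = value
        return value


# 20 concurrent requests for the same cold key
threads = [threading.Thread(target=get_data, args=("popular-item",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the threads finish, `compute_calls` is 1: one request did the work and the other 19 either waited on the lock or found the value already cached.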

2. Stale-While-Revalidate

Serve stale data while recomputing in the background.

def get_with_swr(key):
    value = cache.get(key)
    if value is None:
        value = expensive_computation()  # Cache miss, compute fresh
        cache.set(key, value)            # Store it so the next request hits
    elif value.is_stale:
        queue_for_refresh(key)  # Background task recomputes
    return value  # Fresh, or stale while the refresh runs
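
A fuller sketch of the same idea follows. It assumes each entry records when it stops being fresh, and it uses a plain list as a stand-in for the background job queue so the example stays deterministic. `SWRCache`, `Entry`, and `run_refreshes` are hypothetical names, and the sketch has no hard expiry, so a stale entry is served until a refresh replaces it.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Entry:
    value: Any
    fresh_until: float                 # after this the entry is stale but still servable


class SWRCache:
    def __init__(self, compute: Callable[[str], Any], ttl: float,
                 clock: Callable[[], float] = time.monotonic):
        self.compute = compute
        self.ttl = ttl
        self.clock = clock
        self.entries: Dict[str, Entry] = {}
        self.refresh_queue: List[str] = []   # stand-in for a background job queue

    def _store(self, key: str) -> Any:
        value = self.compute(key)
        self.entries[key] = Entry(value, self.clock() + self.ttl)
        return value

    def get(self, key: str) -> Any:
        entry = self.entries.get(key)
        if entry is None:
            return self._store(key)          # true miss: compute inline
        if self.clock() >= entry.fresh_until and key not in self.refresh_queue:
            self.refresh_queue.append(key)   # schedule at most one refresh per key
        return entry.value                   # fresh or stale, answer immediately

    def run_refreshes(self) -> None:
        # A worker would normally drain the queue; here it is called by hand.
        while self.refresh_queue:
            self._store(self.refresh_queue.pop(0))


now = [0.0]
counter = [0]

def compute(key):
    counter[0] += 1
    return f"{key}-v{counter[0]}"

swr = SWRCache(compute, ttl=60, clock=lambda: now[0])
first = swr.get("report")      # miss, computed inline
now[0] = 61.0
stale = swr.get("report")      # stale value served, refresh queued
swr.run_refreshes()
refreshed = swr.get("report")  # new value after the background refresh
```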

3. Probabilistic Early Expiration

Expire cache items probabilistically, spreading recomputation over time.

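import random
import time

def should_recompute(expires_at, total_ttl):
    time_until_expiry = expires_at - time.time()
    # Probability rises linearly from 0 (just cached) to 1 (at expiry)
    probability = 1 - (time_until_expiry / total_ttl)
    return random.random() < probability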
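
With the linear ramp above, a read halfway through the TTL has a 50% chance of triggering a recompute, which can be eager for hot keys. The sketch below separates the probability from the random draw so it can be inspected, and adds a `power` parameter to steepen the curve. Both `power` and the injectable `rng` are assumptions added for illustration, not part of the formula above; `power=1.0` reproduces it exactly.

```python
import random
from typing import Callable


def recompute_probability(time_until_expiry: float, total_ttl: float,
                          power: float = 1.0) -> float:
    """Chance that a single read triggers an early recompute."""
    elapsed_fraction = 1 - (time_until_expiry / total_ttl)
    elapsed_fraction = min(1.0, max(0.0, elapsed_fraction))   # clamp to [0, 1]
    return elapsed_fraction ** power


def should_recompute(time_until_expiry: float, total_ttl: float,
                     power: float = 1.0,
                     rng: Callable[[], float] = random.random) -> bool:
    return rng() < recompute_probability(time_until_expiry, total_ttl, power)


halfway_linear = recompute_probability(150, 300)            # linear ramp
halfway_steep = recompute_probability(150, 300, power=3)    # refreshes later
```

A higher `power` keeps the probability low for most of the TTL and concentrates recomputation near expiry, which trades fewer early recomputes against a larger burst close to the deadline.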

Cache Warming

Pre-populate cache before application handles requests.

def warm_cache():
    # Load the most-followed users, since they are the likeliest to be requested
    popular_users = db.query("SELECT * FROM users ORDER BY followers DESC LIMIT 1000")
    for user in popular_users:
        cache.set(f"user:{user.id}", user, ttl=3600)

# Run on application startup
warm_cache()

Benefit: Avoid cold-start latency spikes.


Monitoring Cache Health

Track:

| Metric | Formula | Target |
|---|---|---|
| Hit Rate | Hits / (Hits + Misses) | > 80% |
| Eviction Rate | Evictions / Time | Should be ~0 |
| Memory Usage | Used / Max | < 80% |
| P99 Latency | 99th percentile response time | < 10 ms |

Alerts:

  • Hit rate drops below 70% (cache not effective)
  • Eviction rate too high (cache too small)
  • Memory usage near max (will start evicting)
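
The hit rate and memory formulas above are simple enough to compute from raw counters. Below is a small sketch of how the checks might be wired together. The `CacheStats` class and the default thresholds (70% hit rate, 80% memory) are assumptions taken from the targets in this section, not values from any particular monitoring tool, and eviction rate is left out because it needs a time window.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CacheStats:
    hits: int = 0
    misses: int = 0
    used_bytes: int = 0
    max_bytes: int = 1

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0   # avoid dividing by zero

    @property
    def memory_usage(self) -> float:
        return self.used_bytes / self.max_bytes

    def alerts(self, min_hit_rate: float = 0.70,
               max_memory: float = 0.80) -> List[str]:
        problems = []
        # Only judge the hit rate once there has been some traffic
        if (self.hits + self.misses) and self.hit_rate < min_hit_rate:
            problems.append("hit rate below threshold")
        if self.memory_usage > max_memory:
            problems.append("memory usage near max")
        return problems


stats = CacheStats(hits=650, misses=350, used_bytes=900, max_bytes=1000)
problems = stats.alerts()   # both checks fire for these numbers
```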