Episode 6 — Scaling, Reliability, Microservices, Web3 / 6.6 — Caching in Production

6.6 — Caching in Production: Quick Revision

Compact cheat sheet. Print-friendly.

How to use this material (instructions)

  1. Skim before labs or interviews.
  2. Drill gaps — reopen README.md and 6.6.a–6.6.c.
  3. Practice with 6.6-Exercise-Questions.md.
  4. Polish answers with 6.6-Interview-Questions.md.

Core Vocabulary

Term                    → One-liner
Cache                   → Fast temporary storage layer between the app and the database
Redis                   → In-memory data structure store — industry-standard cache (port 6379)
TTL                     → Time To Live — seconds until a cached key auto-expires
Cache hit               → Requested data found in cache — fast response
Cache miss              → Requested data not in cache — must query the database
Hit rate                → % of requests served from cache (target: > 90%)
Cache-aside             → App checks cache; on miss, queries DB and stores the result in cache
Write-through           → Every write goes to both cache and DB synchronously
Write-behind            → Write to cache first, flush to DB asynchronously
Read-through            → Cache itself fetches from DB on miss (app only talks to cache)
Cache stampede          → Many concurrent requests miss an expired key and all hit the DB simultaneously
Stale-while-revalidate  → Serve expired data immediately, refresh in background
ETag                    → Content fingerprint enabling 304 Not Modified responses
CDN                     → Content Delivery Network — caches at edge servers globally
Cache warming           → Pre-loading the cache with data before traffic arrives

Redis Commands Cheat Sheet

# Basic operations
SET key value EX 300          # Set with 5-minute TTL
GET key                       # Get value
DEL key                       # Delete key
EXISTS key                    # Check if key exists (0 or 1)
TTL key                       # Remaining TTL in seconds (-1=no expiry, -2=key does not exist)
EXPIRE key 300                # Set/reset TTL on existing key
MGET key1 key2 key3           # Get multiple keys at once

# SET with conditions
SET key value NX EX 30        # Set ONLY if key does NOT exist (lock pattern)
SET key value XX EX 30        # Set ONLY if key DOES exist

# Hashes (field-level access)
HSET user:123 name "Alice" plan "premium"
HGET user:123 plan            # Get single field
HGETALL user:123              # Get all fields
HINCRBY user:123 visits 1     # Atomic increment

# Lists (queues, recent items)
LPUSH notifications:123 "msg" # Push to head
LRANGE notifications:123 0 9  # Get first 10
LTRIM notifications:123 0 99  # Keep only first 100

# Sets (unique collections)
SADD roles:123 "admin" "editor"
SISMEMBER roles:123 "admin"   # Check membership

# Sorted sets (rankings)
ZADD leaderboard 99.5 "user:1"
ZREVRANGE leaderboard 0 9 WITHSCORES  # Top 10

# Pub/sub (invalidation)
PUBLISH cache:invalidation '{"key":"user:123"}'
SUBSCRIBE cache:invalidation

# Batch operations
SCAN 0 MATCH "user:*" COUNT 100  # Iterate keys (NEVER use KEYS in production)

Caching Patterns Comparison

CACHE-ASIDE (most common):
  Read:  Check cache → MISS → Query DB → Store in cache → Return
  Write: Update DB → Delete cache key
  Best:  General-purpose, simple, safe default

WRITE-THROUGH:
  Read:  Always from cache (populated on write)
  Write: Update DB + Update cache (synchronous)
  Best:  Read-heavy, freshness critical, writes are infrequent

WRITE-BEHIND:
  Read:  Always from cache
  Write: Update cache → Async flush to DB (background)
  Best:  Write-heavy, can tolerate brief data loss (counters, analytics)

READ-THROUGH:
  Read:  App calls cache → Cache fetches from DB on miss
  Write: Same as cache-aside
  Best:  Clean abstraction, app never touches DB directly for reads
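The write-through flow above can be sketched in a few lines. This is a minimal illustration, with in-memory Maps standing in for Redis and the database; `writeThrough`, `read`, `cache`, and `db` are illustrative names, not part of any library:

```javascript
// Write-through sketch: every write hits both stores synchronously,
// so reads can always be served from the cache.
// The Maps below are stand-ins for Redis and the database.
const cache = new Map();
const db = new Map();

function writeThrough(key, value) {
  db.set(key, value);    // 1. persist to the source of truth
  cache.set(key, value); // 2. update the cache in the same operation
}

function read(key) {
  return cache.get(key); // cache is populated on write, so reads never miss
}

writeThrough('user:123', { name: 'Alice' });
console.log(read('user:123')); // { name: 'Alice' }
```

The same skeleton becomes write-behind if step 1 is deferred to a background flush instead of running synchronously.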

Invalidation Strategies

TTL-BASED (passive):
  How:    Set EX/TTL on every key. Redis auto-deletes on expiry.
  Pro:    Dead simple, self-healing
  Con:    Data stale up to full TTL duration
  Use:    Safety net on every key, primary strategy for stable data

EVENT-BASED (active):
  How:    DEL key on every write to the source data
  Pro:    Near-instant freshness
  Con:    Must track all affected cache keys, can miss derived data
  Use:    Primary strategy for frequently changing data

VERSION-BASED:
  How:    Include version number in cache key (user:123:v7)
  Pro:    No need to find/delete old keys, atomic version bump
  Con:    Orphaned keys waste memory until TTL
  Use:    CDN-friendly, multi-key entities

IMPORTANT RULES:
  1. Always update DB first, THEN invalidate cache (never reverse)
  2. DELETE the cache key, don't UPDATE it (avoids race conditions)
  3. Always combine event-based with TTL as safety net
  4. Never use KEYS * in production — use SCAN with cursor
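The version-based strategy above can be sketched as follows. A Map stands in for Redis, and `versionKey`, `cacheKey`, and `invalidate` are hypothetical helper names used only for illustration:

```javascript
// Version-based invalidation sketch: the current version lives in its own key;
// bumping it makes all old cache entries unreachable (they later expire via TTL).
const store = new Map(); // stand-in for Redis

function versionKey(userId) { return `user:${userId}:version`; }

function cacheKey(userId) {
  const v = store.get(versionKey(userId)) ?? 1;
  return `user:${userId}:v${v}`;
}

function invalidate(userId) {
  const v = store.get(versionKey(userId)) ?? 1;
  store.set(versionKey(userId), v + 1); // would be an atomic INCR in real Redis
}

store.set(cacheKey(123), { name: 'Alice' }); // cached under user:123:v1
invalidate(123);                             // bump: v1 is now orphaned
console.log(store.get(cacheKey(123)));       // undefined — forces a fresh DB read
```

Note how no DEL is ever issued: the old key simply stops being addressed, which is what makes this pattern CDN-friendly.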

TTL Guidelines

By data type:
  Static config / flags     →  5-15 minutes
  User profile              →  5-30 minutes
  Product catalog           →  15-60 minutes
  Product prices            →  1-5 minutes
  Inventory / stock         →  30-60 seconds
  Search results            →  5-15 minutes
  Dashboard analytics       →  1-5 minutes
  Real-time data            →  5-15 seconds
  Sessions                  →  24-48 hours
  Rate limit counters       →  60 seconds (matches window)

By caching layer (inner = shorter):
  Browser (max-age)         →  shortest
  CDN (s-maxage)            →  >= browser TTL
  Redis (EX)                →  >= CDN TTL
  DB query cache            →  managed by DB engine

SLIDING vs FIXED TTL:
  Fixed:    Timer starts on SET, counts down regardless of access
  Sliding:  Timer RESETS on every GET (popular keys stay alive)
  Danger:   Sliding TTL can keep stale data alive forever
  Fix:      Combine sliding TTL with a maximum absolute age
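The sliding-TTL-with-cap fix above can be expressed as a small pure function. `nextTtl` is an illustrative name; times are in seconds, and the result would be applied with EXPIRE on each hit:

```javascript
// Sliding TTL with an absolute cap: on each hit, extend the TTL,
// but never past a maximum age measured from when the key was first cached.
function nextTtl(nowSec, createdAtSec, slidingTtlSec, maxAgeSec) {
  const absoluteDeadline = createdAtSec + maxAgeSec; // hard ceiling
  const slidingDeadline = nowSec + slidingTtlSec;    // normal sliding window
  // new expiry is the earlier of the two deadlines
  const expiresAt = Math.min(slidingDeadline, absoluteDeadline);
  return Math.max(0, expiresAt - nowSec);            // TTL to apply via EXPIRE
}

// Created at t=0, sliding TTL 300s, absolute cap 3600s:
console.log(nextTtl(100, 0, 300, 3600));  // 300 — full sliding window
console.log(nextTtl(3500, 0, 300, 3600)); // 100 — capped by max age
```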

HTTP Cache Headers

Cache-Control directives:
  public              → CDN + browser can cache
  private             → Browser only (user-specific data)
  max-age=N           → Browser TTL (seconds)
  s-maxage=N          → CDN TTL (overrides max-age for shared caches)
  no-cache            → Cache it, but ALWAYS revalidate before using
  no-store            → Never cache (sensitive/real-time data)
  must-revalidate     → After max-age, must revalidate (no stale serving)
  stale-while-revalidate=N  → Serve stale for N seconds while revalidating
  immutable           → Content never changes (hashed static assets)

Common patterns:
  Public API:    Cache-Control: public, max-age=60, s-maxage=300
  User-specific: Cache-Control: private, max-age=60
  Sensitive:     Cache-Control: no-store
  Revalidating:  Cache-Control: no-cache (+ ETag or Last-Modified)
  Static assets: Cache-Control: public, max-age=31536000, immutable

ETag flow:
  1st request:  Server → 200 OK + ETag: "abc123" + body
  2nd request:  Client → If-None-Match: "abc123"
                Server → 304 Not Modified (no body, saves bandwidth)
  Data changed: Server → 200 OK + ETag: "def456" + new body

Last-Modified flow:
  1st request:  Server → 200 OK + Last-Modified: <date> + body
  2nd request:  Client → If-Modified-Since: <date>
                Server → 304 Not Modified (no body)
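The ETag flow above can be sketched as a pure handler. `etagFor` here uses a toy djb2-style hash purely for illustration — a real server would fingerprint the body with a strong hash — and `respond` is a hypothetical function shape, not a framework API:

```javascript
// ETag revalidation sketch: given the response body and the client's
// If-None-Match header, decide between 200 (full body) and 304 (no body).
function etagFor(body) {
  let h = 5381; // toy hash for illustration only
  for (let i = 0; i < body.length; i++) h = ((h * 33) ^ body.charCodeAt(i)) >>> 0;
  return `"${h.toString(16)}"`;
}

function respond(body, ifNoneMatch) {
  const etag = etagFor(body);
  if (ifNoneMatch === etag) {
    return { status: 304, headers: { ETag: etag } };     // no body — bandwidth saved
  }
  return { status: 200, headers: { ETag: etag }, body }; // full response
}

const first = respond('hello', undefined);                // 200 + ETag
console.log(respond('hello', first.headers.ETag).status); // 304
```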

Cache Stampede Prevention

LOCKING (Mutex):
  1. Cache miss → try SET lock:key NX EX 10
  2. Lock acquired → fetch from DB, populate cache, release lock
  3. Lock NOT acquired → wait 100ms, retry
  Pro: Simple, guaranteed single DB query
  Con: Others wait (latency)

PROBABILISTIC EARLY EXPIRATION:
  1. On each cache hit, check remaining TTL
  2. If TTL is in last 10%, randomly decide to refresh
  3. Higher probability as TTL approaches 0
  Pro: No waiting, gradual refresh
  Con: Occasional extra DB queries
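The early-expiration decision above can be sketched as one function. This is a simplified linear version (real implementations often use an exponential XFetch-style formula); `shouldRefresh` is an illustrative name, and the random source is injectable so the logic is testable:

```javascript
// Probabilistic early expiration sketch: refresh probability is 0 until the
// key enters the last 10% of its TTL, then rises linearly to 1 at expiry.
function shouldRefresh(remainingTtl, fullTtl, rand = Math.random) {
  const threshold = fullTtl * 0.10;            // last 10% of the TTL
  if (remainingTtl > threshold) return false;  // plenty of time left
  const p = 1 - remainingTtl / threshold;      // 0 at the threshold → 1 at expiry
  return rand() < p;
}

console.log(shouldRefresh(200, 300));          // false — 200s left, threshold is 30s
console.log(shouldRefresh(0, 300, () => 0.5)); // true  — p reaches 1 at expiry
```

Because each hit rolls the dice independently, refreshes spread out across requests instead of piling onto the expiry instant.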

BACKGROUND REFRESH:
  1. Track hot keys
  2. Background job checks TTL periodically
  3. Refresh keys before they expire
  Pro: Keys never actually expire under traffic
  Con: Extra infrastructure, memory for key tracking

Production Caching Layers

Client → Browser Cache → CDN → App → Redis → DB
         (0ms)           (20ms)      (<1ms)  (50-500ms)

Each layer catches requests before the next.
Goal: > 90% of requests never reach the DB.

Common Gotchas

Gotcha                              → Why
Caching without TTL                 → Stale data persists forever if invalidation has a bug
Deleting cache before updating DB   → Creates a window where stale DB data gets re-cached
Using KEYS * in production          → Blocks Redis while it scans ALL keys — use SCAN instead
Updating cache instead of deleting  → Race conditions between concurrent writes
Same TTL for all data types         → Prices need a short TTL; static config can have a long one
Forgetting derived cache keys       → User name changes but the team page still shows the old name
No graceful degradation             → Redis goes down = app crashes (should fall back to the DB)
Browser cache too aggressive        → Users cannot see updates; you cannot remotely purge a browser cache
CDN caching private data            → public on user-specific responses = data leak to other users
Ignoring cache hit rate             → Low hit rate means the cache is not helping — TTL too short or keys too specific
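The graceful-degradation gotcha deserves a sketch. This is one possible shape, not a library API: `cacheGet` and `fetchFromDb` are injected hypothetical functions, which also makes a dead cache easy to simulate:

```javascript
// Graceful degradation sketch: if the cache layer throws (e.g. Redis is down),
// fall back to the database instead of failing the request.
async function getWithFallback(key, cacheGet, fetchFromDb) {
  try {
    const cached = await cacheGet(key);
    if (cached !== undefined && cached !== null) return cached;
  } catch (err) {
    // Cache unavailable: log and continue — the DB is still the source of truth.
    console.error('cache unavailable, falling back to DB:', err.message);
  }
  return fetchFromDb(key);
}

// Simulate a dead cache:
const deadCache = async () => { throw new Error('connection refused'); };
getWithFallback('user:123', deadCache, async () => ({ name: 'Alice' }))
  .then((data) => console.log(data)); // { name: 'Alice' }
```

The DB takes full load while the cache is down, so this belongs alongside (not instead of) alerting on cache availability.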

Node.js Quick Reference

// Connect
import Redis from 'ioredis';
const redis = new Redis({ host: '127.0.0.1', port: 6379 });

// Cache-aside pattern
async function getCached(key, fetchFn, ttl = 300) {
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);
  const data = await fetchFn();
  if (data) await redis.set(key, JSON.stringify(data), 'EX', ttl);
  return data;
}

// Invalidate
await redis.del('user:123');

// Batch with pipeline
const pipe = redis.pipeline();
pipe.set('k1', 'v1', 'EX', 300);
pipe.set('k2', 'v2', 'EX', 300);
await pipe.exec();

// Stampede-safe fetch
const lock = await redis.set(`lock:${key}`, '1', 'NX', 'EX', 10);
if (lock) { /* fetch and populate */ }
else { /* wait and retry */ }

// Cache middleware (Express)
app.get('/api/data', async (req, res) => {
  const cached = await redis.get(req.originalUrl);
  if (cached) { res.set('X-Cache', 'HIT'); return res.json(JSON.parse(cached)); }
  const data = await fetchFromDB();
  await redis.set(req.originalUrl, JSON.stringify(data), 'EX', 300);
  res.set('X-Cache', 'MISS');
  res.json(data);
});

End of 6.6 quick revision.