Episode 6 — Scaling, Reliability, Microservices, Web3 / 6.6 — Caching in Production
6.6 — Caching in Production: Quick Revision
Compact cheat sheet. Print-friendly.
How to use this material (instructions)
- Skim before labs or interviews.
- Drill gaps — reopen README.md → 6.6.a…6.6.c.
- Practice — 6.6-Exercise-Questions.md.
- Polish answers — 6.6-Interview-Questions.md.
Core Vocabulary
| Term | One-liner |
|---|---|
| Cache | Fast temporary storage layer between the app and the database |
| Redis | In-memory data structure store — industry-standard cache (port 6379) |
| TTL | Time To Live — seconds until a cached key auto-expires |
| Cache hit | Requested data found in cache — fast response |
| Cache miss | Requested data not in cache — must query the database |
| Hit rate | % of requests served from cache (target: > 90%) |
| Cache-aside | App checks cache, on miss queries DB and stores result in cache |
| Write-through | Every write goes to both cache and DB synchronously |
| Write-behind | Write to cache first, flush to DB asynchronously |
| Read-through | Cache itself fetches from DB on miss (app only talks to cache) |
| Cache stampede | Many concurrent requests miss an expired key, all hit DB simultaneously |
| Stale-while-revalidate | Serve expired data immediately, refresh in background |
| ETag | Content fingerprint enabling 304 Not Modified responses |
| CDN | Content Delivery Network — caches at edge servers globally |
| Cache warming | Pre-loading cache with data before traffic arrives |
Redis Commands Cheat Sheet
# Basic operations
SET key value EX 300 # Set with 5-minute TTL
GET key # Get value
DEL key # Delete key
EXISTS key # Check if key exists (0 or 1)
TTL key # Remaining TTL in seconds (-1=no expiry, -2=gone)
EXPIRE key 300 # Set/reset TTL on existing key
MGET key1 key2 key3 # Get multiple keys at once
# SET with conditions
SET key value NX EX 30 # Set ONLY if key does NOT exist (lock pattern)
SET key value XX EX 30 # Set ONLY if key DOES exist
# Hashes (field-level access)
HSET user:123 name "Alice" plan "premium"
HGET user:123 plan # Get single field
HGETALL user:123 # Get all fields
HINCRBY user:123 visits 1 # Atomic increment
# Lists (queues, recent items)
LPUSH notifications:123 "msg" # Push to head
LRANGE notifications:123 0 9 # Get first 10
LTRIM notifications:123 0 99 # Keep only first 100
# Sets (unique collections)
SADD roles:123 "admin" "editor"
SISMEMBER roles:123 "admin" # Check membership
# Sorted sets (rankings)
ZADD leaderboard 99.5 "user:1"
ZREVRANGE leaderboard 0 9 WITHSCORES # Top 10
# Pub/sub (invalidation)
PUBLISH cache:invalidation '{"key":"user:123"}'
SUBSCRIBE cache:invalidation
# Batch operations
SCAN 0 MATCH "user:*" COUNT 100 # Iterate keys (NEVER use KEYS in production)
Caching Patterns Comparison
CACHE-ASIDE (most common):
Read: Check cache → MISS → Query DB → Store in cache → Return
Write: Update DB → Delete cache key
Best: General-purpose, simple, safe default
WRITE-THROUGH:
Read: Always from cache (populated on write)
Write: Update DB + Update cache (synchronous)
Best: Read-heavy, freshness critical, writes are infrequent
WRITE-BEHIND:
Read: Always from cache
Write: Update cache → Async flush to DB (background)
Best: Write-heavy, can tolerate brief data loss (counters, analytics)
READ-THROUGH:
Read: App calls cache → Cache fetches from DB on miss
Write: Same as cache-aside
Best: Clean abstraction, app never touches DB directly for reads
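The patterns above differ mainly in who writes the cache and when. A minimal sketch of cache-aside vs write-through, using plain Maps to stand in for Redis and the database (all names here are illustrative, not a prescribed API):

```javascript
// In-memory stand-ins for Redis and the database (illustration only).
const cache = new Map();
const db = new Map([['user:1', { name: 'Alice' }]]);

// Cache-aside: the app checks the cache; on a miss it queries the DB
// and populates the cache itself.
function cacheAsideRead(key) {
  if (cache.has(key)) return cache.get(key);        // hit
  const value = db.get(key);                        // miss → DB
  if (value !== undefined) cache.set(key, value);   // populate
  return value;
}

function cacheAsideWrite(key, value) {
  db.set(key, value);   // 1. update the source of truth first
  cache.delete(key);    // 2. invalidate (delete, don't update)
}

// Write-through: every write goes to both stores synchronously,
// so reads can always be served from the cache.
function writeThrough(key, value) {
  db.set(key, value);
  cache.set(key, value);
}
```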
Invalidation Strategies
TTL-BASED (passive):
How: Set EX/TTL on every key. Redis auto-deletes on expiry.
Pro: Dead simple, self-healing
Con: Data stale up to full TTL duration
Use: Safety net on every key, primary strategy for stable data
EVENT-BASED (active):
How: DEL key on every write to the source data
Pro: Near-instant freshness
Con: Must track all affected cache keys, can miss derived data
Use: Primary strategy for frequently changing data
VERSION-BASED:
How: Include version number in cache key (user:123:v7)
Pro: No need to find/delete old keys, atomic version bump
Con: Orphaned keys waste memory until TTL
Use: CDN-friendly, multi-key entities
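Version-based invalidation can be sketched with a per-entity version counter. The `entityVersions` Map and the `user:<id>:v<N>` key shape below are illustrative choices (in production the counter could live in Redis itself):

```javascript
// Current version per entity.
const entityVersions = new Map();

function versionedKey(entity, id) {
  const v = entityVersions.get(`${entity}:${id}`) ?? 1;
  return `${entity}:${id}:v${v}`;   // e.g. user:123:v7
}

// "Invalidate" by bumping the version: old keys are never read again
// and simply age out via their TTL.
function bumpVersion(entity, id) {
  const k = `${entity}:${id}`;
  entityVersions.set(k, (entityVersions.get(k) ?? 1) + 1);
}
```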
IMPORTANT RULES:
1. Always update DB first, THEN invalidate cache (never reverse)
2. DELETE the cache key, don't UPDATE it (avoids race conditions)
3. Always combine event-based with TTL as safety net
4. Never use KEYS * in production — use SCAN with cursor
TTL Guidelines
By data type:
Static config / flags → 5-15 minutes
User profile → 5-30 minutes
Product catalog → 15-60 minutes
Product prices → 1-5 minutes
Inventory / stock → 30-60 seconds
Search results → 5-15 minutes
Dashboard analytics → 1-5 minutes
Real-time data → 5-15 seconds
Sessions → 24-48 hours
Rate limit counters → 60 seconds (matches window)
By caching layer (outer = shorter, TTLs grow as you move toward the DB):
Browser (max-age) → shortest
CDN (s-maxage) → >= browser TTL
Redis (EX) → >= CDN TTL
DB query cache → managed by DB engine
SLIDING vs FIXED TTL:
Fixed: Timer starts on SET, counts down regardless of access
Sliding: Timer RESETS on every GET (popular keys stay alive)
Danger: Sliding TTL can keep stale data alive forever
Fix: Combine sliding TTL with a maximum absolute age
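The sliding-TTL fix above (sliding expiry plus a hard cap on absolute age) can be sketched with entries that track both creation time and last access; the constants and names are illustrative:

```javascript
const SLIDING_TTL_MS = 60_000;        // reset on every read
const MAX_ABSOLUTE_AGE_MS = 600_000;  // hard cap: never serve data older than 10 min

function isEntryAlive(entry, now = Date.now()) {
  const idleTooLong = now - entry.lastAccess > SLIDING_TTL_MS;
  const tooOld = now - entry.createdAt > MAX_ABSOLUTE_AGE_MS; // cap overrides sliding
  return !idleTooLong && !tooOld;
}

function touch(entry, now = Date.now()) {
  entry.lastAccess = now; // sliding part: each GET extends the timer
  return entry;
}
```

Without the `MAX_ABSOLUTE_AGE_MS` check, a key read every minute would never expire, which is exactly the "stale forever" danger noted above.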
HTTP Cache Headers
Cache-Control directives:
public → CDN + browser can cache
private → Browser only (user-specific data)
max-age=N → Browser TTL (seconds)
s-maxage=N → CDN TTL (overrides max-age for shared caches)
no-cache → Cache it, but ALWAYS revalidate before using
no-store → Never cache (sensitive/real-time data)
must-revalidate → After max-age, must revalidate (no stale serving)
stale-while-revalidate=N → Serve stale for N seconds while revalidating
immutable → Content never changes (hashed static assets)
Common patterns:
Public API: Cache-Control: public, max-age=60, s-maxage=300
User-specific: Cache-Control: private, max-age=60
Sensitive: Cache-Control: no-store
Revalidating: Cache-Control: no-cache (+ ETag or Last-Modified)
Static assets: Cache-Control: public, max-age=31536000, immutable
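The common patterns above can be centralized in a small helper so routes don't hand-build header strings. The `cachePolicy` function and its policy names are illustrative, not a standard API:

```javascript
// Map a named policy to a Cache-Control value (mirrors the patterns above).
function cachePolicy(name) {
  switch (name) {
    case 'public-api':   return 'public, max-age=60, s-maxage=300';
    case 'user-private': return 'private, max-age=60';
    case 'sensitive':    return 'no-store';
    case 'revalidate':   return 'no-cache'; // pair with ETag or Last-Modified
    case 'static-asset': return 'public, max-age=31536000, immutable';
    default: throw new Error(`unknown cache policy: ${name}`);
  }
}

// Express usage (sketch): res.set('Cache-Control', cachePolicy('public-api'));
```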
ETag flow:
1st request: Server → 200 OK + ETag: "abc123" + body
2nd request: Client → If-None-Match: "abc123"
Server → 304 Not Modified (no body, saves bandwidth)
Data changed: Server → 200 OK + ETag: "def456" + new body
Last-Modified flow:
1st request: Server → 200 OK + Last-Modified: <date> + body
2nd request: Client → If-Modified-Since: <date>
Server → 304 Not Modified (no body)
Cache Stampede Prevention
LOCKING (Mutex):
1. Cache miss → try SET lock:key 1 NX EX 10
2. Lock acquired → fetch from DB, populate cache, release lock
3. Lock NOT acquired → wait 100ms, retry
Pro: Simple, guaranteed single DB query
Con: Others wait (latency)
PROBABILISTIC EARLY EXPIRATION:
1. On each cache hit, check remaining TTL
2. If TTL is in last 10%, randomly decide to refresh
3. Higher probability as TTL approaches 0
Pro: No waiting, gradual refresh
Con: Occasional extra DB queries
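The probabilistic step above (refresh more eagerly as TTL runs out) reduces to a single decision function. The 10% threshold and the linear probability ramp below are illustrative tuning choices:

```javascript
// Decide whether this cache hit should trigger an early refresh.
// remainingTtl and originalTtl are in the same units (e.g. seconds).
function shouldRefreshEarly(remainingTtl, originalTtl, rand = Math.random()) {
  const threshold = originalTtl * 0.1;    // only act in the last 10% of key lifetime
  if (remainingTtl > threshold) return false;
  // Probability ramps linearly from 0 (at the threshold) to 1 (at expiry).
  const probability = 1 - remainingTtl / threshold;
  return rand < probability;
}
```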
BACKGROUND REFRESH:
1. Track hot keys
2. Background job checks TTL periodically
3. Refresh keys before they expire
Pro: Keys never actually expire under traffic
Con: Extra infrastructure, memory for key tracking
Production Caching Layers
┌───────────────────────────────────────────────────────┐
│ Client → Browser Cache → CDN → App → Redis → DB       │
│          (0ms)          (20ms)       (<1ms) (50-500ms)│
│                                                       │
│ Each layer catches requests before the next.          │
│ Goal: > 90% of requests never reach the DB.           │
└───────────────────────────────────────────────────────┘
Common Gotchas
| Gotcha | Why |
|---|---|
| Caching without TTL | Stale data persists forever if invalidation has a bug |
| Deleting cache before updating DB | Creates a window where stale DB data gets re-cached |
| Using KEYS * in production | Blocks Redis while scanning ALL keys — use SCAN instead |
| Updating cache instead of deleting | Race conditions between concurrent writes |
| Same TTL for all data types | Prices need short TTL, static config can have long TTL |
| Forgetting derived cache keys | User name changes but team page still shows old name |
| No graceful degradation | Redis goes down = app crashes (should fallback to DB) |
| Browser cache too aggressive | Users cannot see updates; you cannot remotely purge a browser cache |
| CDN caching private data | public on user-specific responses = data leak to other users |
| Ignoring cache hit rate | Low hit rate means cache is not helping — TTL too short or keys too specific |
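For the last gotcha, hit rate is cheap to measure in-process. A minimal counter sketch (names illustrative — in production you would export these to your metrics system instead of keeping them in a plain object):

```javascript
const stats = { hits: 0, misses: 0 };

// Call once per cache lookup.
function recordLookup(wasHit) {
  wasHit ? stats.hits++ : stats.misses++;
}

// Percentage of lookups served from cache; target > 90%.
function hitRate() {
  const total = stats.hits + stats.misses;
  return total === 0 ? 0 : (stats.hits / total) * 100;
}
```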
Node.js Quick Reference
// Connect
import Redis from 'ioredis';
const redis = new Redis({ host: '127.0.0.1', port: 6379 });
// Cache-aside pattern
async function getCached(key, fetchFn, ttl = 300) {
const cached = await redis.get(key);
if (cached) return JSON.parse(cached);
const data = await fetchFn();
if (data) await redis.set(key, JSON.stringify(data), 'EX', ttl);
return data;
}
// Invalidate
await redis.del('user:123');
// Batch with pipeline
const pipe = redis.pipeline();
pipe.set('k1', 'v1', 'EX', 300);
pipe.set('k2', 'v2', 'EX', 300);
await pipe.exec();
// Stampede-safe fetch
const lock = await redis.set(`lock:${key}`, '1', 'NX', 'EX', 10);
if (lock) { /* fetch and populate */ }
else { /* wait and retry */ }
// Cached route handler (Express)
app.get('/api/data', async (req, res) => {
const cached = await redis.get(req.originalUrl);
if (cached) { res.set('X-Cache', 'HIT'); return res.json(JSON.parse(cached)); }
const data = await fetchFromDB();
await redis.set(req.originalUrl, JSON.stringify(data), 'EX', 300);
res.set('X-Cache', 'MISS');
res.json(data);
});
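The "no graceful degradation" gotcha above suggests wrapping every cache read so a Redis outage degrades to a DB query instead of an error. A sketch with an injected client (the failing stub in the usage note only simulates an outage; swap in the real ioredis instance in production):

```javascript
// Wrap a cache read so any cache failure falls back to the database.
async function getWithFallback(client, key, fetchFromDB) {
  let cached = null;
  try {
    cached = await client.get(key); // throws if Redis is unreachable
  } catch {
    cached = null; // degrade silently; optionally log / increment a metric here
  }
  if (cached !== null) return JSON.parse(cached);
  return fetchFromDB(); // cache unavailable or miss → source of truth
}
```

Usage: `getWithFallback(redis, 'user:123', () => db.loadUser(123))` — where `db.loadUser` is whatever query normally populates the key.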
End of 6.6 quick revision.