Episode 9 — System Design / 9.11 — Real World System Design Problems

9.11 Quick Revision -- Cheat Sheet for All 12 System Designs

Use this file for last-minute review before interviews. Each design is condensed to its essential components, key decisions, and common pitfalls.


Comparison Table: All 12 Systems at a Glance

#       System             Read/Write              Consistency      Real-time?  Core Pattern
--------------------------------------------------------------------------------------------
9.11.a  URL Shortener      Read-heavy (100:1)      Eventual         No          Key-value lookup + cache
9.11.b  Rate Limiter       Read-heavy              Strong (approx)  Yes         Atomic counters + sliding window
9.11.c  Chat System        Write-heavy             Eventual         Yes         WebSocket + pub/sub
9.11.d  Social Media Feed  Read-heavy              Eventual         Near-RT     Fan-out (hybrid push/pull)
9.11.e  Video Streaming    Read-heavy              Eventual         Yes (live)  CDN + adaptive bitrate
9.11.f  E-Commerce         Mixed                   Strong (orders)  No          Microservices + SAGA
9.11.g  Notification Sys.  Write-heavy             At-least-once    Yes         Priority queues + fan-out
9.11.h  File Storage       Mixed                   Eventual (sync)  Near-RT     Chunking + dedup + sync
9.11.i  Search Engine      Read-heavy              Eventual         No          Inverted index + ranking
9.11.j  Ride Sharing       Write-heavy (location)  Strong (trips)   Yes         Geospatial index + matching
9.11.k  Payment System     Write-heavy             Strong (ACID)    No          Idempotency + double-entry ledger
9.11.l  Monitoring System  Write-heavy             Eventual         Yes         Time-series DB + alerting

Detailed Dimension Comparison

Data Volume and Throughput

System             Write Rate          Read Rate             Storage (yearly)
------------------------------------------------------------------------------
URL Shortener      40/sec              4,000/sec             3.6 TB
Rate Limiter       1M checks/sec       1M checks/sec         2 GB (in-memory)
Chat System        230K msgs/sec       Variable              ~4 TB
Social Media Feed  5,800 posts/sec     300K feed loads/sec   ~20 TB
Video Streaming    500 hrs/min upload  50M concurrent views  400+ PB
E-Commerce         115 orders/sec      5,800 searches/sec    ~50 TB
Notification Sys.  116K notifs/sec     N/A (push)            ~10 TB
File Storage       1,150 uploads/sec   Variable              50 PB
Search Engine      Crawl: batch        100K queries/sec      10 TB (index)
Ride Sharing       1.25M loc/sec       500K geo queries/sec  ~970 TB (loc.)
Payment System     580 TPS             2,900 API calls/sec   ~36 TB
Monitoring System  10M points/sec      5K dashboard q/sec    ~81 TB (compressed)

Primary Database Choices

System             Primary DB               Cache/Fast Store       Analytics Store
-----------------------------------------------------------------------------------
URL Shortener      DynamoDB / Cassandra     Redis                  ClickHouse
Rate Limiter       N/A                      Redis (Lua scripts)    N/A
Chat System        Cassandra                Redis (sessions)       N/A
Social Media Feed  Cassandra                Redis (feeds, graph)   N/A
Video Streaming    PostgreSQL + S3          CDN                    ClickHouse
E-Commerce         PostgreSQL               Redis (cart, catalog)  Elasticsearch
Notification Sys.  PostgreSQL               Redis (dedup)          N/A
File Storage       PostgreSQL (metadata)    Redis (sync state)     N/A
Search Engine      Custom (inverted index)  N/A                    N/A
Ride Sharing       PostgreSQL (trips)       Redis GEO (location)   Druid
Payment System     PostgreSQL (ACID)        Redis (idempotency)    N/A
Monitoring System  Custom TSDB              In-memory head block   Elasticsearch (logs)

Per-System Summaries

9.11.a URL Shortener (TinyURL)

Summary:      Read-heavy key-value service with caching and analytics pipeline.
Components:   Pre-Generated Key Service, Redis cache, NoSQL DB, Kafka analytics
Key Decisions:
  - Pre-generated keys over hash-based generation (no collisions)
  - 302 redirect over 301 (for analytics accuracy)
  - Async analytics via Kafka (keeps redirect fast)
Pitfalls:
  - 301 redirect caches in browser, killing analytics accuracy
  - Hash collisions under high volume without collision detection
  - Not caching hot URLs wastes DB capacity on 80/20 workload
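The pre-generated key idea reduces to base62-encoding a counter range offline, so the request path never needs a collision check. A minimal sketch (function names and the 7-character width are illustrative, not from any specific implementation):

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 chars

def encode_base62(n):
    """Encode a non-negative counter value as a base62 string."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out))

def pregenerate_keys(start, count, width=7):
    """Offline job: mint a batch of guaranteed-unique short keys from a
    disjoint counter range -- no collision detection needed at request time."""
    return [encode_base62(i).rjust(width, ALPHABET[0])
            for i in range(start, start + count)]
```

Each Key Service instance claims a disjoint counter range, so batches never overlap.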

9.11.b Rate Limiter

Summary:      Distributed atomic counter system protecting APIs from abuse.
Components:   Redis cluster, Lua scripts, API gateway integration
Key Decisions:
  - Sliding window counter over fixed window (avoids 2x burst at boundary)
  - Redis Lua for atomicity (check + decrement in one round trip)
  - Fail OPEN on Redis failure (never block all traffic)
Pitfalls:
  - Fixed window allows 2x burst at window boundary
  - Fail CLOSED on limiter failure takes down the entire service
  - Not using Lua scripts means race conditions between check and decrement
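The sliding window counter can be sketched in-process; in production the per-key state lives in Redis and the whole check runs atomically in one Lua script. A minimal approximation that weights the previous fixed window by its remaining overlap:

```python
import time

class SlidingWindowCounter:
    """Sliding-window counter: estimate = current window's count plus the
    previous window's count weighted by how much of it still overlaps the
    sliding window. O(1) memory per key, unlike a full sliding log."""

    def __init__(self, limit, window_secs):
        self.limit = limit
        self.window = window_secs
        self.counts = {}  # key -> {window_index: count}

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        idx = int(now // self.window)
        buckets = self.counts.setdefault(key, {})
        elapsed = (now % self.window) / self.window
        estimate = buckets.get(idx, 0) + buckets.get(idx - 1, 0) * (1 - elapsed)
        if estimate >= self.limit:
            return False  # over limit (in prod: fail OPEN if the store is down)
        buckets[idx] = buckets.get(idx, 0) + 1
        for old in [b for b in buckets if b < idx - 1]:
            del buckets[old]  # drop windows that no longer overlap
        return True
```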

9.11.c Chat System (WhatsApp)

Summary:      Real-time bidirectional messaging with presence and group support.
Components:   WebSocket servers, Redis (sessions/presence), Cassandra (messages), Kafka
Key Decisions:
  - WebSocket for real-time, REST for message history
  - At-least-once delivery with client-side deduplication
  - Lazy presence (fetch on demand, not broadcast to all contacts)
Pitfalls:
  - Broadcasting presence to all contacts is prohibitively expensive
  - Not handling message ordering across partitions causes out-of-order display
  - Large group fan-out without pub/sub overwhelms individual servers
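Client-side deduplication for at-least-once delivery can be as simple as a bounded set of recently seen message IDs. A sketch (the capacity and eviction policy are illustrative):

```python
from collections import OrderedDict

class MessageDeduper:
    """Drop at-least-once redeliveries by message ID, keeping only a
    bounded window of recently seen IDs."""

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self.seen = OrderedDict()

    def accept(self, msg_id):
        if msg_id in self.seen:
            self.seen.move_to_end(msg_id)
            return False  # duplicate: already displayed, drop it
        self.seen[msg_id] = True
        if len(self.seen) > self.capacity:
            self.seen.popitem(last=False)  # evict the oldest ID
        return True
```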

9.11.d Social Media Feed (Twitter)

Summary:      Timeline generation using hybrid fan-out with ML-based ranking.
Components:   Fan-out service, Feed cache (Redis), Ranking ML model, Cassandra
Key Decisions:
  - Hybrid fan-out: push for regular users, pull for celebrities (> 10K followers)
  - ML-ranked feed over purely chronological
  - Cursor-based pagination (not offset) for real-time insertions
Pitfalls:
  - Pure push fails for users with 50M+ followers (33-minute write storm)
  - Offset pagination breaks when new posts are inserted during scroll
  - Not handling the celebrity problem means one viral post can overwhelm writes
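The hybrid fan-out decision is a threshold check at write time plus a merge at read time. A minimal in-memory sketch (plain dicts stand in for the Redis feed cache; the 10K threshold comes from the design above; a real merge would sort by timestamp and apply ML ranking):

```python
CELEBRITY_THRESHOLD = 10_000  # push below this, pull above it

def fan_out_post(author_id, post_id, followers, feed_cache, celebrity_posts):
    """Write path: push into every follower's cached feed for regular
    authors; for celebrities, record the post once for pull-at-read."""
    if len(followers) > CELEBRITY_THRESHOLD:
        celebrity_posts.setdefault(author_id, []).append(post_id)
    else:
        for f in followers:
            feed_cache.setdefault(f, []).insert(0, post_id)

def read_feed(user_id, followed_celebrities, feed_cache, celebrity_posts, limit=20):
    """Read path: merge the pre-computed feed with pulled celebrity posts."""
    merged = list(feed_cache.get(user_id, []))
    for celeb in followed_celebrities:
        merged.extend(celebrity_posts.get(celeb, []))
    return merged[:limit]
```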

9.11.e Video Streaming (YouTube)

Summary:      Upload pipeline with transcoding, CDN delivery, and adaptive bitrate.
Components:   Chunked upload (S3), GPU transcoding workers, HLS/DASH, 3-tier CDN
Key Decisions:
  - Eager transcoding on upload (not on first play)
  - CDN absorbs 99%+ of streaming bandwidth
  - Approximate view counting (batched, not per-view DB write)
Pitfalls:
  - Lazy transcoding means first viewer waits minutes for encoding
  - Not using adaptive bitrate causes buffering on slow connections
  - Per-view DB writes at 50M concurrent streams would overwhelm any database
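Approximate view counting batches increments in memory and flushes one aggregated write per video, not one write per view. A sketch (the flush threshold is illustrative; a real system would also flush on a timer and accept losing a few seconds of counts on a crash):

```python
class BatchedViewCounter:
    """Accumulate view increments in memory; flush one aggregated write
    per video once a threshold is reached, turning N per-view writes
    into roughly N / flush_threshold database writes."""

    def __init__(self, flush_threshold=1000):
        self.flush_threshold = flush_threshold
        self.pending = {}  # video_id -> unflushed count
        self.db = {}       # stand-in for the durable counter store

    def record_view(self, video_id):
        self.pending[video_id] = self.pending.get(video_id, 0) + 1
        if self.pending[video_id] >= self.flush_threshold:
            self.flush(video_id)

    def flush(self, video_id):
        self.db[video_id] = self.db.get(video_id, 0) + self.pending.pop(video_id, 0)
```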

9.11.f E-Commerce (Amazon)

Summary:      Microservices architecture with SAGA pattern for distributed transactions.
Components:   Product catalog, Cart (Redis), Inventory, Order, Payment services
Key Decisions:
  - SAGA over 2PC (more resilient, supports eventual consistency)
  - Cart in Redis (sub-ms access, TTL-based expiration)
  - Flash sale: virtual queue + Redis atomic decrement
Pitfalls:
  - 2PC blocks all participants on coordinator failure
  - Not using TTL on inventory reservations leaves phantom holds
  - Flash sale without a virtual queue causes thundering herd to inventory DB
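The SAGA pattern pairs every step with a compensating action and unwinds completed steps in reverse when a later step fails. A minimal orchestration sketch:

```python
def run_saga(steps):
    """Run each (action, compensation) pair in order; if any action
    raises, run the compensations of the completed steps in reverse
    and report failure instead of leaving partial state behind."""
    done = []
    try:
        for action, compensate in steps:
            action()
            done.append(compensate)
        return True
    except Exception:
        for compensate in reversed(done):
            compensate()
        return False
```

Unlike 2PC, no participant holds locks while waiting on a coordinator; the cost is that intermediate states are briefly visible (eventual consistency).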

9.11.g Notification System

Summary:      Multi-channel delivery system with priority routing and fan-out.
Components:   Kafka (priority topics), Channel workers (APNS/FCM/email/SMS), templates
Key Decisions:
  - Separate queues per priority per channel (12 total topics)
  - Circuit breaker on providers with automatic failover
  - Per-user rate limits to prevent notification fatigue
Pitfalls:
  - Single queue for all priorities means marketing delays critical alerts
  - No per-user rate limit causes notification fatigue and app uninstalls
  - Not deduplicating leads to sending the same notification 3 times on retry
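The per-priority-per-channel queue layout is just a naming convention over Kafka topics. A sketch assuming three priorities and four channels (these particular names are illustrative), which yields the 12 topics mentioned above:

```python
PRIORITIES = ("critical", "high", "low")       # illustrative tier names
CHANNELS = ("push", "email", "sms", "in_app")  # illustrative channel names

def topic_for(priority, channel):
    """One topic per (channel, priority) pair, so low-priority marketing
    traffic can never queue ahead of critical alerts on the same channel."""
    if priority not in PRIORITIES or channel not in CHANNELS:
        raise ValueError("unknown priority or channel")
    return f"notif.{channel}.{priority}"
```

Workers for a channel then consume its `critical` topic with more consumers (or strictly first) than its `low` topic.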

9.11.h File Storage (Dropbox)

Summary:      Chunked file storage with deduplication and cross-device sync.
Components:   CDC chunker (Rabin fingerprint), S3 (blob store), PostgreSQL (metadata)
Key Decisions:
  - Content-Defined Chunking over fixed-size (efficient delta sync)
  - Conflict copy over last-writer-wins (no data loss)
  - Direct-to-S3 upload via presigned URL (bypasses app servers)
Pitfalls:
  - Fixed-size chunking: 1-byte insert invalidates ALL subsequent chunks
  - Last-writer-wins silently loses one user's changes
  - Routing all uploads through app servers creates a bandwidth bottleneck
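Content-defined chunking cuts wherever a hash of the recent content hits a boundary condition, so cut points move with the content instead of the byte offset. A deliberately simplified sketch using a plain polynomial accumulator in place of a true windowed Rabin fingerprint (the divisor and size bounds are illustrative):

```python
def cdc_chunks(data, divisor=64, min_size=32, max_size=1024):
    """Cut a chunk wherever the running hash of its bytes satisfies
    hash % divisor == 0, subject to min/max chunk sizes. Because the
    condition depends on content rather than position, an insert
    reshapes nearby chunks instead of shifting every chunk after it
    (the fixed-size-chunking pitfall above)."""
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = (h * 31 + b) & 0xFFFFFFFF  # toy hash; real CDC uses Rabin
        size = i - start + 1
        if (size >= min_size and h % divisor == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Deduplication then stores each distinct chunk (keyed by its content hash) exactly once.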

9.11.i Search Engine (Google)

Summary:      Crawl-index-rank pipeline with inverted index and distributed serving.
Components:   Crawler (priority frontier), Inverted index, Ranking (BM25 + PageRank + ML)
Key Decisions:
  - Document-based sharding over term-based (avoids hot shards)
  - Tiered index: query top-quality pages first, expand if needed
  - SimHash for near-duplicate detection
Pitfalls:
  - Term-based sharding creates hot shards on common terms ("the")
  - Not respecting robots.txt gets your crawler blocked
  - Single-tier index wastes computation ranking low-quality pages
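An inverted index maps each term to the set of documents containing it; an AND query intersects the posting lists of its terms. A toy sketch (real indexes use sorted, compressed posting lists and positional data):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """AND semantics: intersect the posting lists of all query terms."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()
```

Document-based sharding splits `docs` across shards and unions per-shard results; term-based sharding would instead split `index` by term, concentrating "the" on one hot shard.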

9.11.j Ride Sharing (Uber)

Summary:      Real-time geospatial matching with location ingestion at 1.25M/sec.
Components:   Redis GEO, Matching service, Surge pricing, ETA service, WebSocket
Key Decisions:
  - Redis GEO over custom quadtree (simpler, fast enough)
  - Scored greedy matching over global optimization (speed > optimality)
  - Per-city sharding (trips never cross city boundaries)
Pitfalls:
  - Two concurrent requests matching same driver without atomic locking
  - Not batching location updates overwhelms the network (5M updates/4sec)
  - Stale driver locations (no TTL) match riders to drivers who left the area
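Scored greedy matching plus an atomic claim can be sketched as follows. A local Python set stands in for the distributed lock (e.g. a Redis SET NX claim); in production the check-and-claim must be one atomic operation to avoid the double-match pitfall above:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def match_driver(rider_pos, drivers, claimed):
    """Greedy matching: take the nearest unclaimed driver, then claim them.
    (A real scorer would also weight ETA, rating, and acceptance rate.)"""
    candidates = sorted(
        (d for d in drivers if d["id"] not in claimed),
        key=lambda d: haversine_km(*rider_pos, d["lat"], d["lon"]),
    )
    if not candidates:
        return None
    claimed.add(candidates[0]["id"])  # must be atomic in production
    return candidates[0]["id"]
```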

9.11.k Payment System (Stripe)

Summary:      Exactly-once financial processing with ledger, fraud detection, PCI.
Components:   Payment state machine, Ledger (double-entry), Fraud engine, Card vault
Key Decisions:
  - Idempotency keys at every boundary (client, PSP, ledger)
  - Outbox pattern for atomic DB update + event publishing
  - Tokenization in isolated PCI environment (merchants never see card numbers)
Pitfalls:
  - Dual-write without outbox pattern loses events or creates inconsistency
  - Storing idempotency keys only in Redis risks loss on restart = double charge
  - Single PSP means one outage stops all payments (use multi-PSP routing)
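An idempotency key gates the charge: the first request with a key executes and durably stores its result; any retry with the same key replays the stored result. A sketch (in production the key-to-result mapping must be written in the same transaction as the payment record, never only in a cache, per the Redis pitfall above):

```python
class IdempotentCharger:
    """Exactly-once effect over at-least-once requests: retries with the
    same idempotency key replay the stored result instead of re-charging."""

    def __init__(self):
        self.results = {}  # key -> result; must be as durable as the ledger
        self.charges = []  # stand-in for calls to the payment processor

    def charge(self, idempotency_key, amount_cents):
        if idempotency_key in self.results:
            return self.results[idempotency_key]  # replayed, no second charge
        self.charges.append(amount_cents)  # the side effect we must not repeat
        result = {"status": "succeeded", "amount": amount_cents}
        self.results[idempotency_key] = result
        return result
```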

9.11.l Monitoring System (Datadog)

Summary:      High-throughput metrics ingestion with TSDB, alerting, and tracing.
Components:   Agents, Kafka, custom TSDB (Gorilla compression), Alert evaluators
Key Decisions:
  - Hybrid push/pull collection (push for custom, pull for Prometheus compat.)
  - Gorilla compression: 12x reduction (delta-of-delta + XOR encoding)
  - Downsampling: 15 days full-res, 3 months 1-min, 1 year 5-min
Pitfalls:
  - High-cardinality tags (e.g., user_id as a label) create millions of time series
  - No downsampling keeps all data forever at 43 TB/day
  - Fixed alert thresholds cause flapping; need hysteresis + seasonal baselines
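Delta-of-delta encoding exploits regular scrape intervals: when timestamps arrive every ~15 seconds, the delta-of-delta is almost always 0, which Gorilla then encodes in a single bit. The first stage of the transform:

```python
def delta_of_delta(timestamps):
    """First stage of Gorilla timestamp compression: with a near-constant
    scrape interval the deltas barely change, so the delta-of-delta stream
    is mostly zeros and small residuals that bit-pack compactly."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return [d2 - d1 for d1, d2 in zip(deltas, deltas[1:])]
```

Values get the analogous treatment via XOR with the previous value, since adjacent metric samples usually share most of their bits.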

Cross-Problem Pattern Reference

Pattern                    Problems Where Used
-------------------------------------------------------------------
Caching (Redis)            URL Shortener, Feed, E-Commerce, Search, Ride Sharing
Message Queue (Kafka)      Chat, Notifications, Video, Monitoring, Payment
Database Sharding          URL Shortener, Chat, E-Commerce, Payment, Search
CDN                        Video Streaming, File Storage, E-Commerce
Fan-out                    Feed (hybrid), Chat (groups), Notifications
Idempotency                Payment, E-Commerce, Notifications
Rate Limiting              Rate Limiter, Notifications, URL Shortener, Payment
WebSocket                  Chat, Ride Sharing, File Storage (sync), Monitoring
SAGA Pattern               E-Commerce, Payment (refund flow)
Geospatial Index           Ride Sharing
Inverted Index             Search Engine, Monitoring (label index)
Time-Series Storage        Monitoring
Optimistic Locking         E-Commerce (inventory), Payment (state machine)
Circuit Breaker            Notifications (provider failover), E-Commerce
Pre-generated Keys         URL Shortener
Content-Defined Chunking   File Storage
Sliding Window             Rate Limiter
Outbox Pattern             Payment
Double-Entry Ledger        Payment
Gorilla Compression        Monitoring (TSDB)
Tail-based Sampling        Monitoring (traces)

Read-Heavy vs Write-Heavy Classification

Most Read-Heavy (caching is highly effective)

1. URL Shortener     100:1 read-to-write ratio
   Why caching works: small key-value lookups, 80/20 traffic distribution,
                      data is immutable after creation.

2. Search Engine     Reads dominate (index built in batch)
   Why caching works: popular queries repeat frequently, results change
                      slowly, retrieval phase is expensive to compute.

3. Video Streaming   Millions of views per upload
   Why caching works: CDN caches video segments at edge, popular content
                      follows power law (top 1% of videos = 90% of views).

Most Write-Heavy (append-only/WAL/queues are critical)

1. Monitoring System  10M data points/sec ingested
   Why WAL matters:   Data points must survive writer crashes; WAL provides
                      durability without expensive sync writes per point.

2. Ride Sharing       1.25M location updates/sec
   Why queues matter: Kafka buffers location updates during spikes; Redis
                      TTL ensures stale data auto-expires.

3. Chat System        230K messages/sec
   Why append-only:   Cassandra's append-only log-structured storage handles
                      high write throughput; messages are immutable once sent.

Consistency vs Availability Classification

Strong Consistency Required

Payment System:     Cannot double-charge; ACID transactions mandatory
E-Commerce Orders:  Cannot oversell inventory; optimistic locking on stock
Ride Sharing Trips: Cannot assign driver to two riders simultaneously

Eventual Consistency Acceptable

URL Shortener:      Slightly stale redirect is fine (cache TTL)
Social Media Feed:  5-30 second delay before post appears acceptable
Search Engine:      Index can be minutes behind real-time
Monitoring:         Metrics can tolerate seconds of lag
Chat (messages):    At-least-once with client dedup; order resolved client-side

Real-Time vs Batch Classification

Real-Time Systems (sub-second response required)

Chat System:       Messages must arrive in < 1 second
Ride Sharing:      Location updates every 4 seconds, matching in < 10 seconds
Monitoring:        Alert evaluation within 30 seconds of metric arrival
Notification:      Critical alerts delivered within seconds

Near-Real-Time (seconds to minutes acceptable)

Social Media Feed: 5-30 second delay for new posts
File Storage Sync: Sync within a few seconds across devices

Batch (minutes to hours acceptable)

Video Transcoding: Minutes to transcode after upload
Search Indexing:   Crawl and index cycle is hours/days
Payment Settlement: T+2 day settlement batches
Reconciliation:    Daily batch process

Common Interview Traps and How to Avoid Them

Trap                              How to Avoid
------------------------------------------------------------------
Jumping to database choice        Start with requirements and access patterns
before understanding access       first; the DB choice follows naturally.
patterns.

Saying "use Kafka" without        Explain WHY: decoupling, buffering, durability.
explaining why.                   Name what ordering/delivery guarantees you need.

Designing for Google scale        Start with realistic numbers. 100 users/sec is
on day one.                       fine for an MVP. Show how to scale later.

Ignoring failure modes.           For every component, state what happens when
                                  it fails and how the system degrades gracefully.

"We will just add more            Identify the specific bottleneck (CPU, memory,
servers."                         disk, network) and how sharding/replication
                                  addresses it.

Single point of failure in        Every critical path component needs redundancy:
the architecture.                 load balancer, database replica, queue cluster.

Not stating tradeoffs.            Every decision has a cost. "I chose X because
                                  Y, at the expense of Z" shows engineering maturity.

Final Interview Reminders

1. Always start with requirements clarification (5 minutes).
2. Do capacity estimation early -- it guides database and caching choices.
3. Draw the high-level diagram BEFORE diving into any component.
4. Name your tradeoffs: "I chose X because Y, at the cost of Z."
5. Deep dive into the 2 most interesting or challenging components.
6. Mention failure modes and how you handle them.
7. Think out loud -- the interviewer wants to see your reasoning process.
8. Time management: do not spend 20 minutes on schema and run out of time
   for the deep dive (where most points are awarded).