Episode 9 — System Design / 9.8 — Communication and Data Layer

9.8 — Quick Revision (Cheat Sheet)

How to use this material (instructions)

Use this for quick review before interviews, not as a primary learning resource.
If any item feels unfamiliar, go back to the corresponding subtopic file.
Print this or keep it open on a second screen during mock interviews.

1. Protocol Comparison

Protocol	Transport	Connection	Direction	Best For
HTTP/1.1	TCP	Persistent (keep-alive)	Request-Response	Standard web, REST APIs
HTTP/2	TCP	Multiplexed streams	Request-Response	Modern web (multiple resources)
HTTP/3	QUIC (UDP)	Multiplexed, 0-RTT	Request-Response	Low-latency web
WebSocket	TCP	Persistent, full-duplex	Bidirectional	Chat, live updates, gaming
SSE	HTTP	Persistent, one-way	Server -> Client	Notifications, live feeds
gRPC	HTTP/2	Multiplexed	4 patterns (unary, server/client/bidirectional streaming)	Microservice-to-microservice

2. HTTP Methods

Method	Action	Idempotent	Safe	Body
GET	Read	Yes	Yes	No
POST	Create	No	No	Yes
PUT	Replace	Yes	No	Yes
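PATCH	Update	No*	No	Yes
DELETE	Remove	Yes	No	No

* PATCH is not guaranteed to be idempotent, but a specific PATCH can be designed to be (for example, setting a field to a fixed value rather than incrementing it).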
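The Idempotent column can be illustrated with a toy in-memory resource collection. This is only a sketch: the `Collection` class and its method names are hypothetical stand-ins for HTTP handlers, not a real framework API.

```python
import itertools

class Collection:
    """Toy resource store whose methods mimic HTTP method semantics."""

    def __init__(self):
        self.items = {}
        self._ids = itertools.count(1)

    def post(self, body):
        # POST creates a new resource on every call: not idempotent.
        new_id = next(self._ids)
        self.items[new_id] = dict(body)
        return new_id

    def put(self, item_id, body):
        # PUT replaces the whole resource: repeating it changes nothing.
        self.items[item_id] = dict(body)

    def delete(self, item_id):
        # DELETE leaves the same end state whether run once or twice.
        self.items.pop(item_id, None)

    def patch_increment(self, item_id, field):
        # A relative PATCH: each repeat changes the state again.
        self.items[item_id][field] += 1

c = Collection()
first = c.post({"qty": 1})
second = c.post({"qty": 1})        # same request, second resource
c.put(first, {"qty": 5})
c.put(first, {"qty": 5})           # same state as after one PUT
c.patch_increment(first, "qty")
c.patch_increment(first, "qty")    # qty is now 7, not 6
c.delete(second)
c.delete(second)                   # no error, same end state
print(c.items)                     # {1: {'qty': 7}}
```

This is why retrying a timed-out PUT or DELETE is generally safe, while retrying a POST usually needs an idempotency key or a similar deduplication mechanism.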

3. Must-Know Status Codes

Code	Meaning	When
200	OK	Successful GET/PUT/PATCH
201	Created	Successful POST
204	No Content	Successful DELETE
301	Moved Permanently	URL changed
304	Not Modified	Cache still valid
400	Bad Request	Validation error
401	Unauthorized	Missing or invalid credentials
403	Forbidden	Auth OK, no permission
404	Not Found	Resource missing
409	Conflict	Duplicate, state conflict
429	Too Many Requests	Rate limited
500	Internal Server Error	Server bug
502	Bad Gateway	Upstream failure
503	Service Unavailable	Overloaded

4. REST vs GraphQL

Factor	REST	GraphQL
Endpoints	Many	One (`/graphql`)
Data shape	Server decides	Client decides
Over-fetching	Common	Avoided (client selects fields)
Under-fetching	Common	Avoided (nested data in one query)
Caching	Easy (HTTP)	Harder (typically POST to one endpoint)
Best for	Simple CRUD, public APIs	Complex queries, mobile

5. Pagination

Type	Jump to Page N?	Consistent Under Writes?	Deep Offset Perf
Offset	Yes	No	Degrades
Cursor	No	Yes	Constant
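A minimal sketch of the two approaches over an in-memory list sorted by `id`. The function names `offset_page` and `cursor_page` are hypothetical. In SQL they correspond roughly to `LIMIT ... OFFSET ...` versus `WHERE id > :cursor ORDER BY id LIMIT :n`.

```python
from bisect import bisect_right

def offset_page(rows, page, size):
    """Offset pagination: skip (page - 1) * size rows, then take `size`."""
    start = (page - 1) * size
    return rows[start:start + size]

def cursor_page(rows, after_id, size):
    """Cursor (keyset) pagination: take `size` rows with id > after_id.

    `rows` must be sorted by id. Returns (page, next_cursor).
    """
    ids = [r["id"] for r in rows]
    start = bisect_right(ids, after_id) if after_id is not None else 0
    page = rows[start:start + size]
    next_cursor = page[-1]["id"] if page else None
    return page, next_cursor

rows = [{"id": i} for i in range(1, 7)]       # ids 1..6
first, cursor = cursor_page(rows, None, 2)    # ids 1, 2; cursor is 2
rows.insert(0, {"id": 0})                     # a write lands before the current position
second, _ = cursor_page(rows, cursor, 2)      # ids 3, 4: nothing repeated
shifted = offset_page(rows, 2, 2)             # ids 2, 3: id 2 is shown again
print([r["id"] for r in second], [r["id"] for r in shifted])  # [3, 4] [2, 3]
```

The cursor version resumes from a value rather than a position, which is why it stays consistent under writes but cannot jump straight to page N.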

6. SQL vs NoSQL

Factor	SQL	NoSQL
Schema	Fixed, predefined	Flexible, dynamic
Relationships	Joins (strong)	Embedded/denormalized
Transactions	Multi-table ACID	Usually single-document
Scaling	Vertical (+ read replicas)	Horizontal (sharding)
Query	Ad-hoc SQL	Limited to access patterns
Best for	Financial, relational data	Flexible schemas, massive scale

7. ACID Properties

Property	Meaning	One-Liner
Atomicity	All or nothing	Transaction fully commits or fully rolls back
Consistency	Valid state to valid state	Constraints always enforced
Isolation	Concurrent txns do not interfere	One transaction cannot see another's uncommitted changes
Durability	Survives crashes	Once committed, data is on disk

8. CAP Theorem

  Network partitions WILL happen -> Choose CP or AP

  CP: Consistency + Partition Tolerance (refuse stale reads)
      Examples: MongoDB (default), HBase, Zookeeper
      Use: Banking, inventory, leader election

  AP: Availability + Partition Tolerance (serve stale data)
      Examples: Cassandra, DynamoDB (default), CouchDB
      Use: Social feeds, shopping carts, analytics

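Quorum rule: R + W > N gives strong consistency (N = replicas, R = replicas contacted per read, W = replicas that must acknowledge a write), because every read quorum then overlaps every write quorum.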
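A quick way to sanity-check a replica configuration against the quorum rule, taking N as the replica count, R as the number of replicas a read contacts, and W as the number of replicas that must acknowledge a write. The helper name `quorums_overlap` is made up for this sketch.

```python
def quorums_overlap(n, r, w):
    """True when every read quorum intersects every write quorum (R + W > N)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n

print(quorums_overlap(3, 2, 2))  # True:  balanced quorum reads and writes
print(quorums_overlap(3, 1, 1))  # False: fast, but reads may be stale
print(quorums_overlap(3, 1, 3))  # True:  cheap reads, every write hits all replicas
```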

9. Database Selection Guide

Need	Database	Type
General purpose, complex queries	PostgreSQL	SQL
Flexible schema, rapid development	MongoDB	Document
Sub-ms caching, sessions	Redis	Key-Value
Serverless, predictable perf	DynamoDB	Key-Value/Doc
Massive writes, time-series	Cassandra	Column-Family
Relationship traversal	Neo4j	Graph

10. NoSQL Types

Type	Model	Example	Use Case
Document	JSON documents	MongoDB, CouchDB	Content, catalogs, profiles
Key-Value	key -> value	Redis, DynamoDB	Cache, sessions, config
Column-Family	Row key -> column families	Cassandra, HBase	IoT, time-series, logs
Graph	Nodes + edges	Neo4j, Neptune	Social networks, fraud, recommendations

11. Database Scaling Ladder

  Step 1: Optimize queries + indexes (free performance)
       |
  Step 2: Add read replicas (handle read-heavy load)
       |
  Step 3: Add caching layer - Redis (reduce DB load)
       |
  Step 4: Vertical scaling (bigger machine)
       |
  Step 5: Sharding (horizontal partitioning)

12. Sharding Strategies

Strategy	Distribution	Range Queries	Resharding	Best For
Range	Uneven (hotspots)	Single shard	Medium	Naturally ordered data
Hash	Even	All shards	Expensive (use consistent hashing)	Most cases
Geographic	By region	Within region	Medium	Multi-region, compliance

Shard key must have: High cardinality, even distribution, matches query patterns, immutable.
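The table's hint to use consistent hashing for hash-based sharding can be sketched as a hash ring with virtual nodes. This is an illustrative toy, not a production implementation: the `HashRing` class, the shard names, and the choice of MD5 with 100 virtual nodes are all assumptions.

```python
import bisect
import hashlib

def _hash(key):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._points = []            # sorted (position, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node owns many points so load spreads evenly.
        for i in range(self._vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def lookup(self, key):
        # First point at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self._points, (_hash(key), ""))
        if idx == len(self._points):
            idx = 0
        return self._points[idx][1]

keys = [f"user-{i}" for i in range(1000)]
ring = HashRing(["shard-a", "shard-b", "shard-c"])
before = {k: ring.lookup(k) for k in keys}
ring.add("shard-d")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to the new shard:",
      all(after[k] == "shard-d" for k in moved))
```

With plain `hash(key) % shard_count`, adding a shard remaps most keys. With the ring, only the keys that now belong to the new shard move.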

13. Replication

  Primary (writes) --async--> Replica 1 (reads)
                   --async--> Replica 2 (reads)
                   --sync---> Replica 3 (reads, failover candidate)

Type	Write Speed	Data Loss Risk	Use
Async	Fast	Some (lag)	Most apps
Sync	Slower	None	Critical data
Semi-sync	Balanced	Minimal	Production default

14. Consistency Models

  Strong -----> Sequential -----> Causal -----> Session -----> Eventual
  (slowest,     (total order)    (cause before  (read-your-   (fastest,
   safest)                        effect)        writes)        least safe)

Data Type	Consistency Needed
Bank balance	Strong
Inventory (low stock)	Strong
Social media likes	Eventual
User profile (own view)	Session (read-your-writes)
Analytics	Eventual
Payment processing	Strong

15. Distributed Transactions

Pattern	Blocking?	Consistency	Best For
2PC	Yes (holds locks)	Strong	Within single DB system
Saga (Choreography)	No	Eventual	Simple 2-4 step flows
Saga (Orchestration)	No	Eventual	Complex 5+ step flows
Outbox Pattern	No	Eventual	Reliable event publishing
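The outbox row of the table can be sketched with SQLite from the standard library: the business write and the event are committed in one local transaction, and a separate relay publishes pending events. Table names, the event shape, and the `place_order` and `relay` helpers are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL NOT NULL);
CREATE TABLE outbox (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    payload TEXT NOT NULL,
    published INTEGER NOT NULL DEFAULT 0
);
""")

def place_order(conn, order_id, total):
    event = {"type": "OrderPlaced", "order_id": order_id}
    with conn:  # one transaction: both rows commit, or neither does
        conn.execute("INSERT INTO orders (id, total) VALUES (?, ?)",
                     (order_id, total))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)",
                     (json.dumps(event),))

def relay(conn, publish):
    """Publish pending events in order, marking each one afterwards."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))   # e.g. send to a message broker
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?",
                         (row_id,))
    return len(rows)

place_order(conn, 1, 9.99)
sent = []
relay(conn, sent.append)
print(sent)                        # [{'type': 'OrderPlaced', 'order_id': 1}]
print(relay(conn, sent.append))    # 0
```

Because an event is marked as published only after `publish` returns, a crash in between leads to a resend. Delivery is at-least-once, so consumers should be idempotent.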

16. Saga Compensation

  Forward: Step 1 -> Step 2 -> Step 3 -> Step 4
  
  If Step 3 fails:
  Compensate: Undo Step 2 -> Undo Step 1
  
  Each step needs a reversing "compensating transaction"
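The compensation flow above can be sketched as a small orchestrator. This is a simplified in-process model: `run_saga` is a hypothetical helper, and in a real system each action and compensation would be a call to another service.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, run the compensations of the steps that already
    completed, in reverse order, then re-raise the error.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            raise
        completed.append(compensate)

log = []

def fail_step_3():
    raise RuntimeError("step 3 failed")

steps = [
    (lambda: log.append("step 1"), lambda: log.append("undo step 1")),
    (lambda: log.append("step 2"), lambda: log.append("undo step 2")),
    (fail_step_3,                  lambda: log.append("undo step 3")),
    (lambda: log.append("step 4"), lambda: log.append("undo step 4")),
]

try:
    run_saga(steps)
except RuntimeError:
    pass

print(log)  # ['step 1', 'step 2', 'undo step 2', 'undo step 1']
```

Step 3 itself is not compensated because it never completed, and step 4 never runs.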

17. Connection Pooling

  Formula: connections = (cores * 2) + spindles
  Typical: 10-20 per app server (not 100+)
  Tools: PgBouncer (PostgreSQL), ProxySQL (MySQL), HikariCP (Java)
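The formula above as a worked calculation. As this rule of thumb is commonly stated, `cores` refers to the database server's CPU cores and `spindles` to the effective number of disks doing concurrent I/O. Treat the result as a starting point to load-test, not a fixed answer.

```python
def pool_size(cores, spindles):
    """connections = (cores * 2) + spindles, per the rule of thumb above."""
    if cores < 1 or spindles < 0:
        raise ValueError("cores must be >= 1 and spindles >= 0")
    return cores * 2 + spindles

print(pool_size(cores=4, spindles=1))  # 9
print(pool_size(cores=8, spindles=0))  # 16
```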

18. Indexing Quick Reference

Index	Best For	Not For
B-Tree (default)	`=`, `<`, `>`, `BETWEEN`, `ORDER BY`	Full-text, spatial
Hash	Exact `=` only	Range queries
GIN	Full-text, JSONB, arrays	Simple equality
BRIN	Large, naturally ordered tables	Random access

Composite index rule: Leftmost prefix. Index on (A, B, C) supports WHERE A=?, WHERE A=? AND B=?, but NOT WHERE B=? alone.
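The leftmost-prefix rule can be modeled in a few lines. This is a deliberately simplified sketch that considers only equality filters. Real query planners have more options (such as scanning the whole index), so the hypothetical `usable_prefix` helper illustrates the rule rather than any specific database.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that the equality filters can use."""
    prefix = []
    for col in index_columns:
        if col not in equality_columns:
            break                  # a gap ends the usable prefix
        prefix.append(col)
    return prefix

index = ["A", "B", "C"]
print(usable_prefix(index, {"A"}))        # ['A']
print(usable_prefix(index, {"A", "B"}))   # ['A', 'B']
print(usable_prefix(index, {"B"}))        # []
print(usable_prefix(index, {"A", "C"}))   # ['A']
```

The last case shows that `WHERE A=? AND C=?` can use the index to narrow by A only. C is filtered afterwards because B is missing from the predicate.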

19. Latency Numbers

Operation	Time
L1 cache	0.5 ns
L2 cache	7 ns
RAM	100 ns
SSD read	16 us
HDD read	2 ms
Same datacenter RTT	0.5 ms
Same region RTT	1-5 ms
Cross-continent RTT	80-150 ms
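The ratios between these numbers tend to matter more than the absolute values. A small calculation using the table's figures, converted to nanoseconds. The dictionary and the `ratio` helper are just for illustration, and the cross-continent entry uses the low end of the range.

```python
# Values copied from the table above, in nanoseconds.
LATENCY_NS = {
    "L1 cache": 0.5,
    "L2 cache": 7,
    "RAM": 100,
    "SSD read": 16_000,
    "HDD read": 2_000_000,
    "Same datacenter RTT": 500_000,
    "Cross-continent RTT (low end)": 80_000_000,
}

def ratio(slow, fast):
    """How many `fast` operations fit in the time of one `slow` operation."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]

print(ratio("RAM", "L1 cache"))                                       # 200.0
print(ratio("Same datacenter RTT", "RAM"))                            # 5000.0
print(ratio("Cross-continent RTT (low end)", "Same datacenter RTT"))  # 160.0
```

One cross-continent round trip costs about as much as 160 round trips inside a datacenter, which is the usual argument for CDNs, regional replicas, and batching chatty calls.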

20. CDN Basics

  User -> CDN Edge (nearby)
            |
            +-- HIT: return cached content (fast)
            +-- MISS: fetch from origin, cache it, return

Cache: static assets (long TTL), HTML (short TTL), API responses (sometimes), user data (rarely).
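The HIT/MISS flow above, sketched as a toy edge cache with a TTL. The `EdgeCache` class and the `origin` function are hypothetical, and a fake clock is injected so the example runs deterministically.

```python
import time

class EdgeCache:
    """Toy edge cache: serve from memory until the TTL expires."""

    def __init__(self, fetch_origin, ttl_seconds, clock=time.monotonic):
        self._fetch_origin = fetch_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}                      # path -> (body, expires_at)

    def get(self, path):
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and now < entry[1]:
            return entry[0], "HIT"
        body = self._fetch_origin(path)       # MISS: fetch from origin
        self._store[path] = (body, now + self._ttl)
        return body, "MISS"

now = [0.0]                                   # fake clock for the demo
origin_calls = []

def origin(path):
    origin_calls.append(path)
    return f"content of {path}"

edge = EdgeCache(origin, ttl_seconds=60, clock=lambda: now[0])
print(edge.get("/logo.png")[1])   # MISS
print(edge.get("/logo.png")[1])   # HIT
now[0] = 61.0
print(edge.get("/logo.png")[1])   # MISS
```

The third request is a MISS because the 60-second TTL has expired. A longer TTL means fewer origin fetches but staler content, which is why static assets get long TTLs and HTML gets short ones.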

21. Rate Limiting Algorithms

Algorithm	Behavior	Bursts Allowed?
Fixed Window	Count per time window	Yes (at boundary)
Sliding Window	Count in rolling window	Smoothed
Token Bucket	Tokens added at fixed rate	Yes (up to bucket size)
Leaky Bucket	Process at fixed rate	No (smooths everything)

Headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, Retry-After
Status code: 429 Too Many Requests
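A minimal token bucket sketch, matching the table row: tokens refill at a fixed rate and bursts are allowed up to the bucket size. The `TokenBucket` class is illustrative (single process, not thread-safe), and the injected fake clock is only there to make the example deterministic.

```python
import time

class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                  # tokens added per second
        self.capacity = capacity          # maximum burst size
        self._clock = clock
        self._tokens = float(capacity)    # start with a full bucket
        self._last = clock()

    def allow(self, cost=1):
        now = self._clock()
        # Refill for the elapsed time, capped at the bucket size.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False                      # caller would respond with 429

now = [0.0]
bucket = TokenBucket(rate=1, capacity=3, clock=lambda: now[0])
burst = [bucket.allow() for _ in range(4)]
print(burst)     # [True, True, True, False]
now[0] = 2.0     # two seconds later, two tokens have been refilled
refill = [bucket.allow() for _ in range(3)]
print(refill)    # [True, True, False]
```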

22. CRDTs (One-Liner)

Data structures that replicas can update independently and then merge deterministically, without conflicts. Used for: counters (G-Counter), sets (OR-Set), registers (LWW-Register). Used by: Riak, Redis Enterprise; Figma uses a CRDT-inspired approach.
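A G-Counter is small enough to sketch in full. Each replica increments only its own slot, and merging takes the per-replica maximum, so merges can happen in any order and any number of times. The `GCounter` class below is an illustrative toy, not code from any of the systems named above.

```python
class GCounter:
    """Grow-only counter CRDT."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}                 # replica id -> local count

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max: commutative, associative, idempotent.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

a, b = GCounter("a"), GCounter("b")
a.increment(2)            # updates made independently on each replica
b.increment(3)
a.merge(b)
b.merge(a)
a.merge(b)                # merging again changes nothing
print(a.value(), b.value())  # 5 5
```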

23. Interview Decision Framework

When asked "which database/protocol/consistency model would you use?":

  1. State the requirements (read/write ratio, latency, consistency, scale)
  2. Name your choice with a specific database/technology
  3. Justify with tradeoffs ("I chose X because... the tradeoff is...")
  4. Mention what you would NOT use and why

"I would use PostgreSQL for the order data because we need ACID transactions for payment integrity. The tradeoff is that horizontal scaling is harder than with DynamoDB, but at our expected volume of 10K writes/sec, a vertically scaled PostgreSQL with read replicas is sufficient."

Use this sheet for final review. For deeper understanding, revisit the subtopic files: 9.8.a | 9.8.b | 9.8.c | 9.8.d | 9.8.e