Episode 9 — System Design / 9.8 — Communication and Data Layer

9.8 — Exercise Questions

How to use this material (instructions)

Attempt every question on paper or a blank editor before looking at hints.
For coding questions, write actual code — do not just describe the approach.
For design questions, draw ASCII diagrams.
Time yourself: aim for 3-5 minutes per short-answer question, 10-15 minutes per design question.
After completing a section, revisit the corresponding subtopic file to fill knowledge gaps.

Section A: Networking Basics (9.8.a)

A1. List the steps that occur between a user typing https://app.example.com/dashboard in a browser and seeing the page. Include DNS, TCP, TLS, and HTTP steps.

A2. Explain the difference between latency, bandwidth, and throughput. Give a real-world analogy for each.

A3. A data center in US-East sends a request to a server in EU-West. Estimate the minimum round-trip time and explain why it cannot be reduced below a certain threshold.

A4. Compare TCP and UDP across these dimensions: connection setup, reliability, ordering, header size, and typical use cases. Present your answer as a table.

A5. Explain why online multiplayer games often use UDP instead of TCP. What happens when a UDP packet is lost during gameplay?

A6. Draw an ASCII diagram showing how a CDN serves a static image for a user in Tokyo when the origin server is in US-East. Show both cache hit and cache miss scenarios.

A7. What is the difference between HTTP/1.1, HTTP/2, and HTTP/3? Why was each version created?

A8. Explain DNS caching at four levels: browser, operating system, ISP, and recursive resolver. Why is the TTL (Time-To-Live) important?

A9. Compare WebSockets, Server-Sent Events (SSE), and long polling. When would you choose each?

A10. What is gRPC? How does it differ from REST? When would you choose gRPC over REST for microservice communication?

A11. Explain the TLS handshake in 5 steps. Why does HTTPS add latency compared to HTTP?

A12. What is a service mesh? Draw an ASCII diagram showing sidecar proxies in a microservice architecture.

Section B: API Design (9.8.b)

B1. Convert these action-oriented endpoints into proper RESTful resource-oriented endpoints:

POST /createUser
GET /getUserById?id=42
POST /updateUserEmail
POST /deleteUser
GET /getUserOrders?userId=42

B2. Design a complete REST API for a library management system. Define endpoints for:

Books (CRUD, search, check availability)
Members (CRUD, view borrowing history)
Borrowing (borrow a book, return a book, extend due date)

Include HTTP methods, URLs, request bodies, and expected status codes.

B3. Explain the difference between 401 (Unauthorized) and 403 (Forbidden). Give a real-world example where each would be returned.

B4. Your API returns a list of 10 million products. Design pagination for two scenarios:

An admin dashboard where users can jump to page 500
A mobile app with infinite scroll Which pagination strategy would you use for each and why?

B5. Write a rate limiting response that includes proper headers. What HTTP status code should be returned? What headers should be included?

B6. Compare URL path versioning (/v1/users), query parameter versioning (/users?version=1), and header versioning. Which would you recommend for a public API and why?

B7. Design an error response format for an API. Include at least: error code, human-readable message, field-level validation errors, and a request ID. Show a JSON example.

B8. Explain the N+1 problem in REST APIs. How does GraphQL solve it? Give a concrete example.

B9. Your API endpoint GET /api/v1/users/42 needs to return user data along with their 5 most recent orders and 3 closest friends. Compare:

REST approach (how many requests?)
GraphQL approach (show the query)

B10. Write an OpenAPI (Swagger) specification for a POST /api/v1/orders endpoint that creates a new order. Include request body schema, response schema, and error responses.

Section C: SQL vs NoSQL (9.8.c)

C1. Explain the four ACID properties using a bank transfer example. What goes wrong if any one property is violated?

C2. Given this unnormalized table, normalize it to 3NF:

| order_id | customer_name | customer_email    | customer_city | product_name | product_price | qty |
|----------|---------------|-------------------|---------------|-------------|---------------|-----|
| 1        | Alice         | alice@mail.com    | New York      | Laptop      | 999           | 1   |
| 2        | Alice         | alice@mail.com    | New York      | Mouse       | 29            | 2   |
| 3        | Bob           | bob@mail.com      | Chicago       | Laptop      | 999           | 1   |

Show the resulting tables with their columns and relationships.

C3. Explain the CAP theorem. Why can a distributed database not provide all three guarantees simultaneously? Give an example of a CP system and an AP system.

C4. For each scenario, recommend SQL or NoSQL (and which specific database) with justification:

A banking application with complex transactions
A social media activity feed
A session store for a web application
A product catalog with varying attributes
A real-time recommendation engine based on user connections
IoT sensor data from 1 million devices

C5. Compare MongoDB and PostgreSQL across: schema flexibility, transactions, scaling model, query language, joins, and typical use cases.

C6. Explain DynamoDB's single-table design. Given a social media app, design a DynamoDB table that stores users, posts, and comments using partition key + sort key patterns.

C7. Write a SQL query using JOINs to find: "All customers who have placed more than 3 orders in the last 30 days, along with their total spending." Assume tables: customers(id, name, email) and orders(id, customer_id, total, created_at).

C8. Explain the four types of NoSQL databases (document, key-value, column-family, graph). For each, name a database, describe its data model, and give two use cases.

C9. What is the difference between Read Committed and Repeatable Read isolation levels? Give a scenario where choosing the wrong level causes a bug.

C10. You are storing user sessions. Compare using Redis vs DynamoDB vs PostgreSQL. Consider: latency, TTL support, scaling, cost, and operational complexity.

Section D: Database Scaling (9.8.d)

D1. Your PostgreSQL database is handling 5,000 queries per second and average query time has increased from 5ms to 200ms. List 5 things you would investigate before adding hardware.

D2. Draw an ASCII diagram showing a primary database with 3 read replicas. Show the write path and read path. Label which connections are synchronous vs asynchronous.

D3. Explain replication lag. Give a concrete scenario where replication lag causes a bug visible to a user. How would you solve it?

D4. Compare range-based sharding, hash-based sharding, and geographic sharding. Present as a table with: data distribution, range query support, resharding difficulty, and best use case.

D5. You need to shard a messages table for a chat application. The table has columns: id, conversation_id, sender_id, text, created_at. Evaluate these potential shard keys:

id (message ID)
sender_id
conversation_id Which would you choose and why?

D6. Explain consistent hashing. Draw a hash ring diagram showing 4 nodes and 5 keys. Then show what happens when a node is added.

D7. Give three examples of denormalization and explain the tradeoff for each:

Embedding related data
Duplicating a field across tables
Pre-computing an aggregate

D8. Explain connection pooling. Why does a pool of 20 connections often outperform 200 connections? What are the key configuration parameters?

D9. Write SQL to create appropriate indexes for these queries:

-- Query 1
SELECT * FROM orders WHERE user_id = ? AND status = 'pending' ORDER BY created_at DESC;
-- Query 2
SELECT * FROM products WHERE category = ? AND price BETWEEN ? AND ?;
-- Query 3
SELECT user_id, COUNT(*) FROM orders GROUP BY user_id HAVING COUNT(*) > 10;

D10. Your e-commerce database has 100 million orders across 4 shards (hash-based on user_id). A product manager asks for a report: "total revenue per product category for the last quarter." Explain why this is hard with sharding and propose a solution.

Section E: Data Consistency (9.8.e)

E1. Explain strong consistency vs eventual consistency using a social media "like" button as an example. Does it matter if the like count is temporarily wrong?

E2. Using the quorum formula (R + W > N), calculate the minimum values of R and W needed for strong consistency when N = 5. Show all valid combinations.

E3. Draw an ASCII diagram of the Two-Phase Commit protocol. Show both the success case and the failure case (when one participant votes NO).

E4. Design a saga for an airline booking system that involves:

Reserve seat
Charge payment
Issue ticket
Send confirmation email

Define the compensating transaction for each step. What happens if step 3 fails?

E5. Compare choreography-based sagas vs orchestration-based sagas. When would you prefer each? Draw a diagram for each approach.

E6. Explain CRDTs (Conflict-Free Replicated Data Types). How does a G-Counter work? Why does it never have merge conflicts?

E7. A user updates their profile name on their phone (connected to US data center) and simultaneously updates their bio on their laptop (connected to EU data center). How would you handle this conflict with:

Last-Writer-Wins
CRDTs
Application-level conflict resolution

E8. For each of the following, state whether you need strong or eventual consistency and justify:

User's bank account balance
Number of views on a YouTube video
Inventory count for a product with 3 remaining
A user's "last seen" status in a messaging app
The assignment of a driver to a ride request

E9. Explain the read-your-writes consistency model. How would you implement it in a system with 1 primary and 3 read replicas?

E10. Your microservice architecture has these services: Order, Payment, Inventory, Shipping. An order placement spans all four. Compare using 2PC vs a Saga. Which would you choose and why?

Section F: Cross-Topic Integration Questions

F1. Design the complete data flow for a "tweet" in a Twitter-like system:

User composes tweet (API design)
Tweet is stored (database choice)
Tweet appears in followers' feeds (consistency model)
Tweet is visible in search (eventual consistency) Draw the architecture and label each component.

F2. A startup is building a real-time collaborative document editor (like Google Docs). For each concern, recommend a specific technology/pattern:

Communication protocol (client <-> server)
Database for document storage
Conflict resolution strategy
Scaling to 1 million concurrent documents

F3. Design the backend for a flash sale (e.g., 1000 limited items, 100,000 users trying to buy at once):

What API design would you use?
What database and consistency model?
How would you prevent overselling?
How would you handle the traffic spike?

F4. You run a global e-commerce platform. Users in India experience 3-second page loads while US users see 200ms. Diagnose the issue and propose fixes at each layer: DNS, CDN, API, database.

F5. Compare the trade-offs for storing user sessions using:

In-memory (on app server)
Redis
DynamoDB
PostgreSQL
JWT (client-side)

Create a comparison table covering: latency, scalability, persistence, cost, and complexity.

Total: 47 questions across all subtopics.

After completing these exercises, review your weak areas using the corresponding subtopic files, then attempt the Interview Questions.