Episode 9 — System Design / 9.7 — System Design Foundations
9.7.a — What Is High-Level Design (HLD)?
In one sentence: High-Level Design is the process of defining a system's architecture — which services exist, how they talk to each other, what data stores they use, and how the whole thing scales — before writing a single line of code.
Navigation: ← 9.7 Overview · 9.7.b — Requirements Analysis →
Table of Contents
- 1. What Is HLD?
- 2. HLD vs LLD — The Two Halves of System Design
- 3. Components of HLD
- 4. Architecture Diagrams
- 5. HLD in Interviews
- 6. The Whiteboard Approach
- 7. Common Architecture Patterns
- 8. Key Takeaways
- 9. Explain-It Challenge
1. What Is HLD?
High-Level Design (HLD) is a bird's-eye view of a software system. It answers the fundamental question: "How is this system organized?"
An HLD defines:
- What services (or components) the system is made of
- How those services communicate with each other
- What databases, caches, and queues are used
- How the system handles scale, failures, and consistency
- What trade-offs were made and why
| What HLD IS | What HLD IS NOT |
|---|---|
| Architecture of the entire system | Implementation details of one module |
| Service boundaries and responsibilities | Class diagrams or method signatures |
| Data flow between components | Algorithm complexity analysis |
| Technology choices (SQL vs NoSQL, REST vs gRPC) | Language-specific code |
| Scaling and reliability strategy | Unit test coverage |
2. HLD vs LLD — The Two Halves of System Design
┌────────────────────────────────────────────────────────────────┐
│ │
│ "Design Twitter" │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ HLD (This section) │ │
│ │ ──────────────────── │ │
│ │ • Tweet Service, Timeline Service, User Service │ │
│ │ • Redis cache for timelines │ │
│ │ • Kafka for fan-out │ │
│ │ • CDN for media │ │
│ │ • Load balancers at every tier │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ LLD (Covered in 9.1–9.6) │ │
│ │ ───────────────────────── │ │
│ │ • TweetService class with createTweet(), deleteTweet() │ │
│ │ • Observer pattern for notifications │ │
│ │ • Strategy pattern for feed ranking │ │
│ │ • Repository pattern for data access │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────┘
| Aspect | HLD | LLD |
|---|---|---|
| Question | "What boxes go on the whiteboard?" | "What classes go inside each box?" |
| Granularity | Services, databases, network | Classes, methods, interfaces |
| Trade-offs | CAP theorem, latency vs consistency | Coupling vs cohesion, pattern choice |
| Diagram type | Architecture / data-flow | Class / sequence / UML |
| Interview trigger | "Design X at scale" | "Design classes for Y" |
3. Components of HLD
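Every high-level design is assembled from the same small toolkit of building blocks: services, databases, caches, message queues, load balancers, CDNs, and API gateways. The subsections below cover what each one does and the key decisions it brings.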
3.1 Services (Application Servers)
Services are the workers of your system. Each service owns a specific responsibility.
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ User Service │ │ Tweet Service│ │Timeline Svc │
│ │ │ │ │ │
│ • signup() │ │ • create() │ │ • getFeed() │
│ • login() │ │ • delete() │ │ • refresh() │
│ • getUser() │ │ • like() │ │ • paginate() │
└──────────────┘ └──────────────┘ └──────────────┘
Key decisions:
- Monolith vs Microservices — start monolith, split when needed
- Stateless vs Stateful — stateless services scale horizontally; state lives in databases/caches
- Synchronous vs Asynchronous — REST/gRPC for sync, message queues for async
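To make the stateless-vs-stateful decision concrete, here is a minimal sketch in which two service instances share an external session store. `SessionStore` and `UserServiceInstance` are hypothetical names invented for illustration; in a real system the store would be a database or a cache rather than a Python dict.

```python
class SessionStore:
    """Stand-in for a shared store (database or cache) that lives outside the instances."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id)

    def put(self, session_id, user):
        self._data[session_id] = user


class UserServiceInstance:
    """Stateless: keeps no per-user data between requests, only a handle to the store."""

    def __init__(self, name, sessions):
        self.name = name
        self._sessions = sessions

    def login(self, session_id, user):
        self._sessions.put(session_id, user)
        return f"{self.name}: logged in {user}"

    def whoami(self, session_id):
        user = self._sessions.get(session_id)
        return f"{self.name}: {user}" if user else f"{self.name}: anonymous"


store = SessionStore()
instance_a = UserServiceInstance("instance-a", store)
instance_b = UserServiceInstance("instance-b", store)

instance_a.login("session-1", "alice")
reply = instance_b.whoami("session-1")  # a different instance serves the next request
```

Because neither instance holds the session itself, a load balancer can send each request to any instance, which is what makes horizontal scaling straightforward.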
3.2 Databases
Databases are the persistent memory of your system.
| Type | Examples | Best For |
|---|---|---|
| Relational (SQL) | PostgreSQL, MySQL | Structured data, transactions, joins, ACID guarantees |
| Document (NoSQL) | MongoDB, DynamoDB | Flexible schemas, nested objects, rapid iteration |
| Wide-column | Cassandra, HBase | Time-series, write-heavy workloads, massive scale |
| Key-value | Redis, DynamoDB | Lookups by key, sessions, counters, caching |
| Graph | Neo4j, Amazon Neptune | Relationships: social graphs, recommendations |
| Search | Elasticsearch, Solr | Full-text search, log analytics |
Key decisions:
- SQL vs NoSQL (transactions, joins, and strong consistency vs flexible schemas and easier horizontal scaling)
- Read replicas vs sharding
- Single-leader vs multi-leader replication
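The replicas-vs-sharding decision can be sketched as a small router: sharding splits the keys across leaders (scaling writes), while read replicas spread the reads for each shard. This is an illustrative sketch only; `ShardRouter` and the shard names are assumptions, not part of any database's API.

```python
import hashlib
import itertools


def stable_hash(key: str) -> int:
    """Hash that is stable across processes (unlike the built-in hash() for str)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class ShardRouter:
    """Routes a key to one shard; each shard has one leader and some read replicas."""

    def __init__(self, shards: dict[str, list[str]]):
        # shards maps a leader name to its read replicas (hypothetical names).
        self._leaders = sorted(shards)
        self._replica_cycles = {
            leader: itertools.cycle(replicas or [leader])
            for leader, replicas in shards.items()
        }

    def leader_for(self, key: str) -> str:
        """Writes go to the leader of the shard that owns the key."""
        return self._leaders[stable_hash(key) % len(self._leaders)]

    def replica_for(self, key: str) -> str:
        """Reads rotate across the replicas of the owning shard."""
        return next(self._replica_cycles[self.leader_for(key)])


router = ShardRouter({
    "shard-0": ["shard-0-replica-a", "shard-0-replica-b"],
    "shard-1": ["shard-1-replica-a"],
})
write_target = router.leader_for("user:42")
read_target = router.replica_for("user:42")
```

Note that plain modulo hashing remaps most keys when the number of shards changes; that is the problem consistent hashing (section 3.5) is meant to reduce.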
3.3 Caches
Caches store frequently accessed data in memory to reduce latency and database load.
Client Request
│
▼
┌─────────┐ Cache HIT ┌─────────┐
│ Server │ ─────────────────► │ Cache │ (Redis / Memcached)
│ │ │ (RAM) │
│ │ Cache MISS │ │
│ │ ◄───── miss ────── │ │
│ │ └─────────┘
│ │ │
│ │ ▼
│ │ ┌──────────┐
│ │ │ Database │
│ │ └──────────┘
│ │ │
│ │ Write to cache + return
└─────────┘
Caching strategies:
- Cache-aside (lazy loading) — app checks cache first, fills on miss
- Write-through — every write goes to cache AND database
- Write-behind — write to cache, async flush to database
- TTL-based expiry — entries auto-expire after N seconds
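The first and last strategies are often combined. The following is a minimal sketch of cache-aside with TTL-based expiry, using an in-memory dict as a stand-in for Redis or Memcached. `TTLCache`, `get_user`, and `fake_db` are hypothetical names chosen for this example.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """In-memory stand-in for a cache server, with per-entry expiry."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def get_user(user_id: str, cache: TTLCache, load_from_db: Callable[[str], str]) -> str:
    """Cache-aside: check the cache, fall back to the database, then fill the cache."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    value = load_from_db(user_id)
    cache.set(user_id, value)
    return value


db_calls = []


def fake_db(user_id: str) -> str:
    db_calls.append(user_id)  # record each database hit
    return f"profile-of-{user_id}"


cache = TTLCache(ttl_seconds=60)
first = get_user("u1", cache, fake_db)   # miss: goes to the database
second = get_user("u1", cache, fake_db)  # hit: served from the cache
```

After both calls, `db_calls` contains a single entry, which is the reduction in database load the cache is there to provide. The TTL bounds how stale a cached value can get when the database changes underneath it.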
3.4 Message Queues
Queues decouple producers from consumers and enable asynchronous processing.
┌──────────┐ ┌────────────────┐ ┌──────────┐
│ Producer │────►│ Message Queue │────►│ Consumer │
│ (API) │ │ (Kafka/SQS/ │ │ (Worker) │
│ │ │ RabbitMQ) │ │ │
└──────────┘ └────────────────┘ └──────────┘
Benefits:
• Producer doesn't wait for consumer (async)
• Consumer can process at its own pace (the queue buffers bursts)
• Failed messages can be retried (reliability)
• Multiple consumers can process in parallel (scaling)
When to use queues:
- Sending emails/notifications (don't block the API response)
- Processing uploads (video transcoding, image resizing)
- Fan-out (one event triggers many downstream actions)
- Rate limiting / smoothing traffic spikes
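The email case above can be sketched with Python's standard `queue` and `threading` modules standing in for Kafka, SQS, or RabbitMQ. The handler name `handle_signup` and the job format are assumptions made for illustration.

```python
import queue
import threading

jobs = queue.Queue(maxsize=100)  # bounded: a full queue blocks the producer
processed = []
processed_lock = threading.Lock()
STOP = None  # sentinel telling a worker to exit


def handle_signup(email: str) -> str:
    """API handler: enqueue the slow work and respond immediately."""
    jobs.put(f"welcome-email:{email}")
    return "202 Accepted"


def worker() -> None:
    """Consumer: pulls jobs at its own pace until it sees the sentinel."""
    while True:
        job = jobs.get()
        try:
            if job is STOP:
                return
            with processed_lock:
                processed.append(job)  # stand-in for actually sending the email
        finally:
            jobs.task_done()


workers = [threading.Thread(target=worker) for _ in range(2)]
for t in workers:
    t.start()

responses = [handle_signup(f"user{i}@example.com") for i in range(5)]

for _ in workers:
    jobs.put(STOP)  # one sentinel per worker
for t in workers:
    t.join()
```

The handler returns before any email is "sent", and the two workers drain the queue in parallel. An in-process queue loses its contents if the process dies; a real broker adds durability and retries on top of this shape.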
3.5 Load Balancers
Load balancers distribute incoming traffic across multiple server instances.
┌──────────────┐
│ Clients │
└──────┬───────┘
│
┌──────▼───────┐
│ Load │
│ Balancer │
└──┬───┬───┬──┘
│ │ │
┌──────┘ │ └──────┐
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│Server 1 │ │Server 2 │ │Server 3 │
└─────────┘ └─────────┘ └─────────┘
Algorithms:
- Round Robin — distribute requests in order (simple)
- Least Connections — send to the server with fewest active connections
- Weighted — servers with more capacity get more traffic
- IP Hash — same client IP goes to same server (sticky sessions)
- Consistent Hashing — minimize reshuffling when servers are added/removed
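Two of these algorithms are sketched below: least connections (a one-liner) and consistent hashing with virtual nodes. This is a simplified illustration under stated assumptions, not how any particular load balancer implements them; `ConsistentHashRing`, `least_connections`, and the server names are hypothetical.

```python
import bisect
import hashlib


def _position(value: str) -> int:
    """Place a string on the ring using a hash that is stable across processes."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Maps keys to servers; adding or removing a server moves only nearby keys."""

    def __init__(self, servers: list[str], vnodes: int = 100):
        self._vnodes = vnodes  # virtual nodes per server, to even out the distribution
        self._ring: list[tuple[int, str]] = []  # sorted (position, server) pairs
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_position(f"{server}#{i}"), server))

    def remove(self, server: str) -> None:
        self._ring = [(pos, s) for pos, s in self._ring if s != server]

    def server_for(self, key: str) -> str:
        # First virtual node at or after the key's position, wrapping around the ring.
        index = bisect.bisect(self._ring, (_position(key), ""))
        return self._ring[index % len(self._ring)][1]


def least_connections(active: dict[str, int]) -> str:
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)


ring = ConsistentHashRing(["server-1", "server-2", "server-3"])
target = ring.server_for("client-10.0.0.7")
idle = least_connections({"server-1": 3, "server-2": 1, "server-3": 2})
```

When a fourth server is added to the ring, the only keys that change owner are the ones the new server takes over; every other key stays where it was. With plain `hash(key) % n`, changing `n` would remap most keys.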
3.6 CDN (Content Delivery Network)
CDNs cache static assets (images, CSS, JS, videos) at edge servers close to users.
User in Tokyo User in New York
│ │
▼ ▼
┌──────────┐ ┌──────────┐
│ CDN Edge │ │ CDN Edge │
│ (Tokyo) │ │ (NYC) │
└────┬─────┘ └────┬─────┘
│ Cache MISS (first time) │
└──────────┐ ┌────────────────┘
▼ ▼
┌──────────┐
│ Origin │
│ Server │
└──────────┘
3.7 API Gateway
An API gateway is the single entry point for all client requests.
Responsibilities:
- Routing — forward requests to the right service
- Authentication — validate tokens before passing requests downstream
- Rate limiting — protect services from abuse
- Protocol translation — REST to gRPC, WebSocket management
- Response aggregation — combine results from multiple services
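Three of these responsibilities (authentication, rate limiting, routing) can be sketched in a few dozen lines. This is a toy model, not a real gateway: `Gateway`, `TokenBucket`, the status codes returned as tuples, and the token-bucket choice for rate limiting are all assumptions made for illustration. The clock is injectable so the behavior can be checked deterministically.

```python
import time
from typing import Callable

Handler = Callable[[str], str]


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, clock: Callable[[], float]):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        self._tokens = min(self._capacity, self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


class Gateway:
    def __init__(self, routes: dict[str, Handler], valid_tokens: set[str],
                 rate: float, burst: int, clock: Callable[[], float] = time.monotonic):
        self._routes = routes
        self._valid_tokens = valid_tokens
        self._rate, self._burst, self._clock = rate, burst, clock
        self._buckets: dict[str, TokenBucket] = {}

    def handle(self, path: str, token: str) -> tuple[int, str]:
        if token not in self._valid_tokens:              # authentication
            return 401, "unauthorized"
        bucket = self._buckets.get(token)
        if bucket is None:
            bucket = self._buckets[token] = TokenBucket(self._rate, self._burst, self._clock)
        if not bucket.allow():                           # rate limiting, per token
            return 429, "too many requests"
        for prefix, handler in self._routes.items():     # routing by path prefix
            if path.startswith(prefix):
                return 200, handler(path)
        return 404, "no route"


now = [0.0]  # fake clock so the example does not depend on wall time
gateway = Gateway(
    routes={"/tweets": lambda path: "tweet-service", "/users": lambda path: "user-service"},
    valid_tokens={"secret-token"},
    rate=1.0,
    burst=2,
    clock=lambda: now[0],
)
```

With `burst=2`, the first two requests for a token pass, the third is rejected with 429 until the fake clock advances, and a request with an unknown token never reaches a service at all.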
4. Architecture Diagrams
A good HLD diagram is clear, labeled, and tells a story of how data flows through the system.
Anatomy of a System Design Diagram
┌─────────────────────────────────────────────────────────────────────────┐
│ TYPICAL HLD DIAGRAM │
│ │
│ ┌────────┐ ┌─────────┐ ┌─────────────────────────────┐ │
│ │ Mobile │───────►│ CDN │ │ BACKEND │ │
│ │ App │ HTTPS │(static) │ │ │ │
│ └────────┘ └─────────┘ │ ┌───────┐ ┌──────────┐ │ │
│ │ │ API │───►│ Service │ │ │
│ ┌────────┐ ┌─────────┐ HTTP │ │Gateway│ │ A │ │ │
│ │ Web │───────►│ Load │──────►│ │ │ └────┬─────┘ │ │
│ │Browser │ HTTPS │Balancer │ │ │ │ │ │ │
│ └────────┘ └─────────┘ │ │ │ ┌────▼─────┐ │ │
│ │ │ │───►│ Service │ │ │
│ │ └───────┘ │ B │ │ │
│ │ └────┬─────┘ │ │
│ └────────────────────┼───────┘ │
│ │ │
│ ┌──────────┐ ┌─────────┐ ┌───▼────┐ │
│ │ Cache │◄───│ Service │◄─│ Queue │ │
│ │ (Redis) │ │ C │ │(Kafka) │ │
│ └──────────┘ └────┬────┘ └────────┘ │
│ │ │
│ ┌────▼────┐ │
│ │ DB │ │
│ │(Postgres)│ │
│ └─────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
Diagram Rules of Thumb
| Rule | Why |
|---|---|
| Draw clients on the left, backend on the right | Consistent reading direction |
| Label every arrow | Readers must know the protocol (HTTP, gRPC, TCP) |
| Name every box | "Service A" is not enough; use "Tweet Service" or "Auth Service" |
| Show data stores at the bottom | Convention: data sinks at the bottom, data sources at the top |
| Use different shapes | Rectangles for services, cylinders for databases, clouds for CDN/external |
| Number the flow | Walk through the request path: 1 → 2 → 3 |
5. HLD in Interviews
What Interviewers Are Evaluating
| Skill | What They Watch For |
|---|---|
| Requirements gathering | Do you ask clarifying questions or just start drawing? |
| Component selection | Can you pick the right database, cache, and queue for the use case? |
| Trade-off reasoning | Do you explain WHY you chose SQL over NoSQL? |
| Scalability thinking | Do you consider 1M users vs 100M users? |
| Communication | Can you explain your design clearly and in a structured way? |
| Handling unknowns | What do you do when the interviewer pushes back or asks "what about X?" |
Common Interview Prompts
| Prompt | Core Challenge |
|---|---|
| "Design a URL shortener" | Hashing, read-heavy, simple CRUD |
| "Design Twitter / X" | Fan-out, timeline generation, caching |
| "Design YouTube" | Video storage, CDN, transcoding pipeline |
| "Design WhatsApp" | Real-time messaging, WebSockets, presence |
| "Design Uber" | Geospatial queries, matching, real-time updates |
| "Design a rate limiter" | Distributed counting, sliding windows |
| "Design a notification system" | Multiple channels, priority, delivery guarantees |
6. The Whiteboard Approach
Here is the step-by-step method for drawing an HLD on a whiteboard (or virtual whiteboard) in an interview.
┌─────────────────────────────────────────────────────────────┐
│ THE WHITEBOARD METHOD │
│ │
│ Step 1: REQUIREMENTS (top-left corner) │
│ ───────────────────── │
│ Write functional + non-functional requirements │
│ as bullet points. Leave them visible. │
│ │
│ Step 2: ESTIMATION (top-right corner) │
│ ────────────────── │
│ Write QPS, storage, bandwidth numbers. │
│ These guide your scaling decisions. │
│ │
│ Step 3: API DESIGN (below requirements) │
│ ───────────────── │
│ List 3-5 core API endpoints. │
│ POST /tweets, GET /timeline, GET /user/:id │
│ │
│ Step 4: HIGH-LEVEL DIAGRAM (center — the main event) │
│ ───────────────────────── │
│ Draw boxes, arrows, data stores. │
│ Walk through the request path. │
│ │
│ Step 5: DEEP DIVE (expand specific boxes) │
│ ────────────── │
│ Interviewer picks a component; you go deeper. │
│ Database schema, caching strategy, sharding. │
│ │
│ Step 6: TRADE-OFFS (bottom of board) │
│ ──────────────── │
│ Discuss what you chose and what you sacrificed. │
│ "I chose eventual consistency because..." │
└─────────────────────────────────────────────────────────────┘
Physical Board Layout
┌────────────────────────────────────────────────────────────┐
│ REQUIREMENTS │ ESTIMATION │
│ • Post tweets │ • 500M users │
│ • View timeline │ • 10K QPS reads │
│ • Follow users │ • 500 QPS writes │
│ • Like tweets │ • 5TB storage/year │
│ │ │
├───────────────────────┴────────────────────────────────────┤
│ │
│ [ ARCHITECTURE DIAGRAM HERE ] │
│ │
│ Client → LB → API Gateway → Services → DB/Cache │
│ │
├────────────────────────────────────────────────────────────┤
│ APIS │ TRADE-OFFS │
│ POST /tweets │ • SQL for users (ACID) │
│ GET /timeline?page=1 │ • NoSQL for tweets (scale) │
│ POST /follow │ • Push model for timeline │
│ GET /user/:id │ • Eventual consistency OK │
└────────────────────────────────────────────────────────────┘
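The numbers in the estimation corner usually come from a few lines of arithmetic. The sketch below shows the shape of that calculation; every input is an assumed value chosen for illustration and is not meant to reproduce the figures on the sample board above.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(daily_requests: float) -> float:
    """Average queries per second for a given number of requests per day."""
    return daily_requests / SECONDS_PER_DAY


def yearly_storage_bytes(writes_per_day: float, bytes_per_write: float) -> float:
    """Raw storage growth per year, ignoring replication and indexes."""
    return writes_per_day * bytes_per_write * 365


# Assumed inputs, for illustration only.
daily_active_users = 10_000_000
reads_per_user_per_day = 20
writes_per_user_per_day = 2
bytes_per_write = 300
peak_factor = 3  # assumed ratio of peak traffic to average traffic

read_qps = average_qps(daily_active_users * reads_per_user_per_day)    # about 2,315
write_qps = average_qps(daily_active_users * writes_per_user_per_day)  # about 231
peak_read_qps = read_qps * peak_factor                                  # about 6,944
storage_tb_per_year = yearly_storage_bytes(
    daily_active_users * writes_per_user_per_day, bytes_per_write
) / 1e12                                                                # 2.19 TB
```

A read-to-write ratio of about 10:1, as in this example, is the kind of result that justifies a cache and read replicas in the diagram; the exact digits matter less than the order of magnitude.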
7. Common Architecture Patterns
7.1 Monolithic Architecture
┌──────────────────────────────────────┐
│ MONOLITH │
│ │
│ ┌──────┐ ┌──────┐ ┌──────┐ │
│ │ Auth │ │ Feed │ │ User │ │
│ │Module│ │Module│ │Module│ ... │
│ └──────┘ └──────┘ └──────┘ │
│ │
│ Single Deployable │
└──────────────────┬───────────────────┘
│
┌────▼────┐
│ DB │
└─────────┘
Pros: Simple deployment, easy debugging, low latency (in-process calls).
Cons: Hard to scale independently, risky deployments, team coupling.
7.2 Microservices Architecture
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Auth │ │ Feed │ │ User │
│ Service │ │ Service │ │ Service │
└────┬────┘ └────┬────┘ └────┬────┘
│ │ │
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Auth DB │ │ Feed DB │ │ User DB │
└─────────┘ └─────────┘ └─────────┘
Pros: Independent scaling, independent deployment, technology diversity.
Cons: Network complexity, distributed transactions, operational overhead.
7.3 Event-Driven Architecture
┌─────────┐ ┌────────────┐ ┌─────────┐
│ Service │──event──►│ Event │──event──►│ Service │
│ A │ │ Bus │ │ B │
└─────────┘ │(Kafka/SNS) │ └─────────┘
│ │
│ │──event──►┌─────────┐
└────────────┘ │ Service │
│ C │
└─────────┘
Pros: Loose coupling, easy to add new consumers, natural audit trail.
Cons: Eventual consistency, debugging across events is harder, ordering challenges.
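The loose coupling in the event-driven pattern can be seen in a small in-process sketch. `EventBus`, the topic name `tweet.created`, and the two subscribers are hypothetical; a real bus such as Kafka or SNS delivers events asynchronously over the network, which this toy version does not model.

```python
from collections import defaultdict
from typing import Callable

Event = dict
Subscriber = Callable[[Event], None]


class EventBus:
    """In-process stand-in for an event bus: publishers never reference subscribers."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Subscriber]] = defaultdict(list)
        self.log: list[tuple[str, Event]] = []  # append-only record of every event

    def subscribe(self, topic: str, subscriber: Subscriber) -> None:
        self._subscribers[topic].append(subscriber)

    def publish(self, topic: str, event: Event) -> None:
        self.log.append((topic, event))
        for subscriber in list(self._subscribers[topic]):
            subscriber(event)


bus = EventBus()
notifications: list[str] = []
search_index: list[str] = []

# Two independent consumers of the same event; the publisher knows about neither.
bus.subscribe("tweet.created", lambda e: notifications.append(f"notify followers of {e['author']}"))
bus.subscribe("tweet.created", lambda e: search_index.append(e["text"]))

bus.publish("tweet.created", {"author": "alice", "text": "hello"})
```

Adding a third consumer is one more `subscribe` call with no change to the publisher, and `bus.log` is the audit trail mentioned above. Here delivery is synchronous, so the consumers are updated before `publish` returns; with a real asynchronous bus they lag behind, which is where the eventual-consistency cost comes from.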
8. Key Takeaways
- HLD is about the architecture of the entire system — services, data stores, communication, and scaling — not about classes and methods.
- Every HLD uses a toolkit of building blocks: services, databases, caches, queues, load balancers, CDNs, and API gateways.
- A good architecture diagram is labeled, numbered (showing request flow), and uses consistent conventions.
- In interviews, HLD is about structured thinking and trade-off reasoning, not memorizing solutions.
- Start with a monolith mentally, then decompose into services where it makes sense for the scale and requirements.
- The whiteboard approach gives you a systematic layout: requirements, estimation, APIs, diagram, deep dive, trade-offs.
9. Explain-It Challenge
Without looking back, explain in your own words:
- What is the difference between HLD and LLD? Give an example of a decision made at each level.
- Name six building blocks of HLD and explain when you would use each one.
- Why do we need both a cache and a database — why not just use one?
- What is the role of a message queue in an architecture? Give a real-world example.
- Walk through how you would lay out a whiteboard for a system design interview.