Episode 9 — System Design / 9.7 — System Design Foundations
9.7.c — Breaking Into Components
In one sentence: Decomposing a system into well-bounded components (services) is the core skill of high-level design — it determines how independently your teams can build, deploy, and scale each piece.
Navigation: ← 9.7.b — Requirements Analysis · 9.7.d — Capacity Estimation →
Table of Contents
- 1. Why Decompose?
- 2. Identifying Service Boundaries
- 3. Data Flow Between Components
- 4. API Contracts
- 5. Dependency Mapping
- 6. Component Diagram Examples
- 7. Common Decomposition Patterns
- 8. Anti-Patterns in Decomposition
- 9. Key Takeaways
- 10. Explain-It Challenge
1. Why Decompose?
A monolithic blob that does everything is simple to start with, but as a system grows, it becomes a bottleneck for development speed, scalability, and reliability.
MONOLITH DECOMPOSED SYSTEM
──────── ──────────────────
┌──────────────────┐ ┌─────────┐ ┌─────────┐
│ Everything in │ │ Auth │ │ Feed │
│ one big box │ ──────► │ Service │ │ Service │
│ │ └─────────┘ └─────────┘
│ • Auth │
│ • Feed │ ┌─────────┐ ┌─────────┐
│ • Users │ │ User │ │ Media │
│ • Media │ │ Service │ │ Service │
│ • Search │ └─────────┘ └─────────┘
│ • Notifications │
└──────────────────┘ ┌─────────┐
│ Search │
│ Service │
└─────────┘
| Benefit | Explanation |
|---|---|
| Independent scaling | Scale the read-heavy Feed Service without scaling the Auth Service |
| Independent deployment | Deploy a fix to Search without redeploying the whole system |
| Team ownership | Each team owns a service boundary — less coordination overhead |
| Fault isolation | If Media Service crashes, Auth and Feed keep working |
| Technology flexibility | Feed Service can use Redis; Search can use Elasticsearch |
2. Identifying Service Boundaries
The hardest part of decomposition is deciding where to draw the lines. Five proven heuristics follow.
Heuristic 1: Single Responsibility
Each service should own one business capability.
| Good Boundary | Bad Boundary |
|---|---|
| "User Service handles registration, authentication, and profile management" | "UserAndTweetService handles user profiles AND tweet creation" |
| "Payment Service handles all billing logic" | "MiscService handles payments, emails, and logging" |
Heuristic 2: Data Ownership
Each service should own its own data store. If two services need the same table, that is a sign that either they should be one service, or one of them should own the data and expose it to the other via an API.
GOOD: Each service owns its data BAD: Shared database
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│ User │ │ Order │ │ User │ │ Order │
│ Service │ │ Service │ │ Service │ │ Service │
└────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘
│ │ │ │
┌────▼────┐ ┌────▼────┐ └──────┬───────┘
│ User DB │ │Order DB │ ┌─────▼─────┐
└─────────┘ └─────────┘ │ Shared DB │ ← coupling!
└───────────┘
Heuristic 3: Rate of Change
Components that change frequently should be separate from stable components.
| Changes Often | Changes Rarely |
|---|---|
| Recommendation algorithm | User authentication |
| Search ranking | Payment processing |
| UI/BFF (Backend for Frontend) | Core data models |
Heuristic 4: Scaling Needs
Components with different scaling profiles should be separate.
| Component | Scaling Profile |
|---|---|
| Image upload | CPU-intensive (resizing), bursty |
| Timeline read | Memory-intensive (caching), constant high throughput |
| Notification | I/O-heavy (external APIs), can be async |
| Authentication | Low volume, must be always available |
Heuristic 5: Domain-Driven Design (DDD)
Group by bounded context — a business domain with clear boundaries.
┌─────────────────────────────────────────────────────────────┐
│ E-COMMERCE SYSTEM │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ CATALOG │ │ ORDERING │ │ SHIPPING │ │
│ │ Context │ │ Context │ │ Context │ │
│ │ │ │ │ │ │ │
│ │ • Product │ │ • Cart │ │ • Shipment │ │
│ │ • Category │ │ • Order │ │ • Tracking │ │
│ │ • Inventory │ │ • Payment │ │ • Carrier │ │
│ │ • Pricing │ │ • Invoice │ │ • Label │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
│ Each context has its own data model and service boundary │
└─────────────────────────────────────────────────────────────┘
3. Data Flow Between Components
Once you have services, you need to define how data flows between them.
Synchronous Communication (Request-Response)
┌─────────┐ HTTP/gRPC ┌─────────┐
│ Service │ ───────────────► │ Service │
│ A │ ◄─────────────── │ B │
└─────────┘ response └─────────┘
• A WAITS for B to respond
• Simple and predictable
• Creates coupling: if B is slow, A is slow
• If B is down, A may fail
Use when: You need an immediate answer (e.g., "Is this user authenticated?")
Asynchronous Communication (Event/Message)
┌─────────┐ publish ┌──────────┐ consume ┌─────────┐
│ Service │ ───────────────► │ Queue/ │ ────────────► │ Service │
│ A │ │ Topic │ │ B │
└─────────┘ └──────────┘ └─────────┘
• A does NOT wait for B
• Decoupled: A doesn't even know B exists
• B can process at its own pace
• If B is down, messages queue up (no data loss)
Use when: The caller doesn't need an immediate result (e.g., "Send a welcome email after signup")
Comparison Table
| Aspect | Synchronous (REST/gRPC) | Asynchronous (Queue/Event) |
|---|---|---|
| Latency | Caller waits for response | Caller returns immediately |
| Coupling | Tight (A knows about B) | Loose (A publishes; anyone can subscribe) |
| Failure handling | A fails if B fails | Messages buffer; B processes when ready |
| Debugging | Easy (request-response trace) | Harder (events across services) |
| Use case | Auth check, data fetch | Notifications, video processing, analytics |
Hybrid Pattern (Common in Practice)
Client API Gateway Tweet Service Queue Fan-out Worker
│ │ │ │ │
│ POST /tweet │ │ │ │
│─────────────────────►│ │ │ │
│ │ createTweet() │ │ │
│ │─────────────────────►│ │ │
│ │ │ publish event │ │
│ │ │─────────────────►│ │
│ │ { id: 123 } │ │ │
│ │◄─────────────────────│ │ │
│ 201 Created │ │ │ consume event │
│◄─────────────────────│ │ │─────────────────►│
│ │ │ │ update timelines│
│ (synchronous │ │ │ (asynchronous │
│ response) │ │ │ processing) │
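The hybrid flow above can be sketched in a few lines of Python: the tweet is acknowledged synchronously, while timeline fan-out happens asynchronously through an in-process queue standing in for Kafka or RabbitMQ. All names here (`create_tweet`, `fanout_queue`, the toy follower graph) are illustrative, not a real API.

```python
import queue
import threading

fanout_queue = queue.Queue()                     # stand-in for Kafka/RabbitMQ
timelines = {}                                   # follower_id -> list of tweet ids
FOLLOWERS = {"user_789": ["user_1", "user_2"]}   # toy follower graph

def create_tweet(author_id, text):
    """Synchronous path: persist the tweet, publish an event, return at once."""
    tweet = {"id": "tweet_456", "author_id": author_id, "text": text}
    fanout_queue.put(tweet)                      # publish; we do NOT wait for fan-out
    return {"status": 201, "body": tweet}

def fanout_worker():
    """Asynchronous path: consume events and update follower timelines."""
    while True:
        tweet = fanout_queue.get()
        if tweet is None:                        # sentinel used to stop the worker
            break
        for follower in FOLLOWERS.get(tweet["author_id"], []):
            timelines.setdefault(follower, []).append(tweet["id"])
        fanout_queue.task_done()

worker = threading.Thread(target=fanout_worker, daemon=True)
worker.start()

resp = create_tweet("user_789", "Hello, world!")  # caller gets 201 immediately
fanout_queue.join()    # only so the demo can observe results; a real caller never waits
fanout_queue.put(None)
```

Note the key property: `create_tweet` returns before any timeline is updated, which is exactly the "synchronous response, asynchronous processing" split in the diagram.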
4. API Contracts
API contracts define the interface between services. They are the "handshake" agreement.
REST API Contract Example
POST /api/v1/tweets
─────────────────────
Headers:
Authorization: Bearer <token>
Content-Type: application/json
Request Body:
{
"text": "Hello, world!",
"media_ids": ["img_123"]
}
Response (201 Created):
{
"id": "tweet_456",
"text": "Hello, world!",
"author_id": "user_789",
"created_at": "2025-01-15T10:30:00Z",
"media": [{ "id": "img_123", "url": "https://cdn.example.com/img_123.jpg" }]
}
Error Response (400 Bad Request):
{
"error": "TWEET_TOO_LONG",
"message": "Tweet exceeds 280 characters"
}
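A handler enforcing this contract might look like the following sketch. The function name and the hard-coded ids are illustrative; the point is that the success and error shapes match the documented responses exactly.

```python
MAX_TWEET_LEN = 280

def handle_post_tweet(body):
    """Sketch of a POST /api/v1/tweets handler enforcing the contract above."""
    text = body.get("text", "")
    if len(text) > MAX_TWEET_LEN:
        # Error shape matches the documented 400 response
        return 400, {"error": "TWEET_TOO_LONG",
                     "message": f"Tweet exceeds {MAX_TWEET_LEN} characters"}
    # Success shape matches the documented 201 response (id generation elided)
    return 201, {"id": "tweet_456", "text": text, "author_id": "user_789"}

status, payload = handle_post_tweet({"text": "x" * 300})
```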
API Design Principles
| Principle | Explanation |
|---|---|
| Versioning | Use /v1/, /v2/ to avoid breaking existing clients |
| Idempotency | Retrying the same PUT/DELETE should not cause duplicates |
| Pagination | Large lists must support ?page=1&limit=20 or cursor-based pagination |
| Consistent naming | Use nouns for resources: /tweets, /users, not /createTweet |
| Error codes | Return meaningful error codes and messages, not just 500 |
| Rate limiting | Protect your API with per-user rate limits (e.g., 100 req/min) |
Service-to-Service Contract
┌──────────────────────────────────────────────────────────────┐
│ SERVICE CONTRACT │
│ │
│ Provider: User Service │
│ Consumer: Tweet Service │
│ │
│ Endpoint: GET /internal/users/{user_id} │
│ Purpose: Fetch user data for tweet enrichment │
│ SLA: P99 latency < 50ms, 99.99% availability │
│ │
│ Response: │
│ { "id": "user_789", "name": "Alice", "avatar_url": "..." } │
│ │
│ What if User Service is down? │
│ → Tweet Service uses cached user data (stale up to 5 min) │
└──────────────────────────────────────────────────────────────┘
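The "what if the provider is down?" clause in the contract can be sketched as a consumer-side fallback: try the User Service, and on failure serve cached data as long as it is within the agreed staleness budget. `fetch_user`, `call_user_service`, and the cache layout are all illustrative.

```python
import time

user_cache = {}        # user_id -> (fetched_at, user dict)
CACHE_MAX_AGE = 300    # contract allows data stale up to 5 minutes

def call_user_service(user_id):
    """Stand-in for GET /internal/users/{user_id}; here the service is down."""
    raise ConnectionError("User Service unavailable")

def fetch_user(user_id):
    """Try the provider first; fall back to cached data within the staleness budget."""
    try:
        user = call_user_service(user_id)
        user_cache[user_id] = (time.time(), user)
        return user
    except ConnectionError:
        cached = user_cache.get(user_id)
        if cached and time.time() - cached[0] < CACHE_MAX_AGE:
            return cached[1]       # degrade gracefully with stale data
        raise                      # no usable cache: surface the failure

# Pre-populate the cache as if an earlier call had succeeded
user_cache["user_789"] = (time.time(), {"id": "user_789", "name": "Alice"})
user = fetch_user("user_789")      # provider fails, cache saves the request
```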
5. Dependency Mapping
Before finalizing your design, map out which services depend on which. Look for:
- Circular dependencies (A calls B, B calls A — redesign needed)
- Single points of failure (everything depends on one service)
- Critical path (the chain of calls that determines end-to-end latency)
Dependency Diagram
┌────────────────────────────────────────────────────────────┐
│ DEPENDENCY MAP │
│ │
│ ┌───────────┐ │
│ │ API │ │
│ │ Gateway │ │
│ └─────┬─────┘ │
│ ┌────┼────┐ │
│ ▼ ▼ ▼ │
│ ┌──────┐┌──────┐┌──────┐ │
│ │ Auth ││Tweet ││Search│ │
│ │ Svc ││ Svc ││ Svc │ │
│ └──┬───┘└──┬───┘└──┬───┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────┐┌──────┐┌──────────┐ │
│ │ User ││ Feed ││ Elastic │ │
│ │ Svc ││ Svc ││ Search │ │
│ └──┬───┘└──┬───┘└──────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────┐┌──────┐ │
│ │UserDB││FeedDB│ │
│ └──────┘└──────┘ │
│ │
│ Critical path (timeline read): │
│ Gateway → Feed Svc → Feed DB + User Svc → User DB │
│ Latency budget: 200ms total │
└────────────────────────────────────────────────────────────┘
Reducing Dependencies
| Problem | Solution |
|---|---|
| Service A calls B calls C calls A (circular) | Introduce an event bus or merge A and C |
| All services call Auth Service (bottleneck) | Cache auth tokens; use JWT for stateless validation |
| Single database for everything (SPOF) | Each service owns its database; replicate for reads |
| Synchronous chain of 5 services (latency) | Make non-critical calls async via queues |
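The "stateless validation" fix for the Auth Service bottleneck can be sketched with a signed token: any service holding the signing key can verify a token locally, without a network call per request. This uses a plain HMAC for illustration rather than the full JWT format; the names and the secret are placeholders.

```python
import hashlib
import hmac

SECRET = b"shared-signing-key"   # distributed to services, never to clients

def sign_token(payload: str) -> str:
    """Mint a token the way an Auth Service would: payload plus HMAC signature."""
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"

def verify_token(token: str) -> bool:
    """Any service can validate locally -- no call to the Auth Service needed."""
    payload, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)

token = sign_token("user_789")
```

The trade-off: stateless tokens cannot be revoked instantly, which is why real deployments pair them with short expiry times.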
6. Component Diagram Examples
Example 1: URL Shortener
┌────────────┐ ┌─────────────┐ ┌──────────────┐
│ Clients │──────►│ Load │──────►│ URL Service │
│(Browser/ │ HTTPS │ Balancer │ │ │
│ API) │ └─────────────┘ │ • shorten() │
└────────────┘ │ • redirect() │
│ • analytics()│
└──────┬───────┘
│
┌────────────────┼────────────────┐
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌──────────┐
│ Cache │ │ DB │ │Analytics │
│ (Redis) │ │(Postgres│ │ Store │
│ │ │ / Cass.)│ │(ClickHs.)│
│ short→ │ │ short→ │ │ clicks, │
│ long │ │ long │ │ geo, ts │
└─────────┘ └─────────┘ └──────────┘
Data flow (redirect):
1. Client hits /abc123
2. URL Service checks Redis cache
3. Cache HIT → redirect immediately (< 10ms)
4. Cache MISS → query DB, populate cache, redirect
5. Log click event to Analytics Store (async)
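Steps 2-4 of the redirect flow are the classic cache-aside pattern, sketched below with dicts standing in for Redis and Postgres:

```python
cache = {}                                            # stand-in for Redis
db = {"abc123": "https://example.com/very/long/url"}  # stand-in for Postgres

def redirect(short_code):
    """Cache-aside read path: check the cache, fall back to the DB, backfill."""
    if short_code in cache:              # step 3: cache HIT
        return cache[short_code]
    long_url = db.get(short_code)        # step 4: cache MISS -> query DB
    if long_url is None:
        return None                      # unknown code -> 404 upstream
    cache[short_code] = long_url         # populate cache for the next reader
    return long_url

first = redirect("abc123")               # miss: served from the DB
second = redirect("abc123")              # hit: served from the cache
```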
Example 2: Chat Application (WhatsApp-like)
┌──────────┐ ┌──────────┐
│ Mobile │◄──── WebSocket ───►│ Chat │
│ App │ │ Gateway │
└──────────┘ └────┬─────┘
│
┌────────────┼────────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Presence │ │ Message │ │ Group │
│ Service │ │ Service │ │ Service │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
┌────▼─────┐ ┌────▼─────┐ ┌───▼──────┐
│ Redis │ │ Cassandra│ │ MySQL │
│ (online/ │ │ (messages│ │ (groups, │
│ offline)│ │ by chat)│ │ members)│
└──────────┘ └──────────┘ └──────────┘
Key decisions:
• WebSocket for real-time delivery
• Cassandra for messages (write-heavy, time-series)
• Redis for presence (fast reads, ephemeral data)
• MySQL for groups (relational, fewer writes)
Example 3: E-Commerce Platform
┌────────┐ ┌─────┐ ┌───────────┐
│ Web │────►│ CDN │ │ API │
│Browser │ │ │ │ Gateway │
└────────┘ └─────┘ └─────┬─────┘
│
┌────────────────────┼────────────────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Product │ │ Order │ │ Payment │
│ Service │ │ Service │ │ Service │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
┌────▼─────┐ ┌────▼─────┐ ┌────▼─────┐
│ Product │ │ Order DB │ │ Payment │
│ DB + ES │ │ (MySQL) │ │ Gateway │
│ (search) │ └──────────┘ │ (Stripe) │
└──────────┘ └──────────┘
┌────────────┐
│ Queue │
│ (Kafka) │
└──────┬─────┘
│
┌───────────┼───────────┐
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│Inventory │ │ Email │ │Shipping │
│ Update │ │ Notif. │ │ Service │
└──────────┘ └──────────┘ └──────────┘
7. Common Decomposition Patterns
Pattern 1: Backend for Frontend (BFF)
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Mobile │ │ Web │ │ Smart │
│ App │ │ Browser │ │ TV │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
┌────▼─────┐ ┌────▼─────┐ ┌────▼─────┐
│ Mobile │ │ Web │ │ TV │
│ BFF │ │ BFF │ │ BFF │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
└───────────────┼───────────────┘
▼
┌──────────┐
│ Shared │
│ Services │
└──────────┘
Each BFF tailors the API for its specific client (mobile needs less data, TV needs bigger images, etc.).
Pattern 2: Gateway Aggregation
Client makes ONE request:
GET /dashboard
API Gateway calls multiple services in parallel:
├── User Service → user profile
├── Order Service → recent orders
├── Notification Svc → unread count
└── Recommendation → suggested products
Gateway aggregates responses and returns a single JSON
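Gateway aggregation maps naturally onto `asyncio.gather`: the gateway fans out to the downstream services concurrently and merges the results into one response. The downstream calls here are stubs standing in for real service clients.

```python
import asyncio

async def get_profile():          # stand-ins for downstream service calls
    return {"name": "Alice"}

async def get_orders():
    return [{"id": "order_1"}]

async def get_unread_count():
    return 3

async def dashboard():
    """Gateway fans out to the services in parallel and merges the results."""
    profile, orders, unread = await asyncio.gather(
        get_profile(), get_orders(), get_unread_count()
    )
    return {"profile": profile, "orders": orders, "unread": unread}

result = asyncio.run(dashboard())
```

Because the calls run concurrently, the dashboard latency is roughly the slowest single call, not the sum of all of them.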
Pattern 3: Strangler Fig (Migration)
Gradually replace a monolith by routing specific endpoints to new services.
Phase 1: All traffic → Monolith
Phase 2: /api/auth → New Auth Service; everything else → Monolith
Phase 3: /api/auth → Auth Service; /api/feed → New Feed Service; rest → Monolith
Phase 4: Monolith is empty → decommission
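The phased migration boils down to a routing table in front of the monolith that grows one prefix at a time. A minimal sketch (service names illustrative):

```python
# Route table grown phase by phase; unmatched paths fall through to the monolith
ROUTES = {
    "/api/auth": "auth-service",   # carved out in phase 2
    "/api/feed": "feed-service",   # carved out in phase 3
}

def route(path):
    """Send migrated path prefixes to new services, everything else to the monolith."""
    for prefix, service in ROUTES.items():
        if path.startswith(prefix):
            return service
    return "monolith"
```

Each phase is just one more entry in `ROUTES`; when nothing routes to the monolith anymore, it can be decommissioned.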
8. Anti-Patterns in Decomposition
| Anti-Pattern | Problem | Fix |
|---|---|---|
| Distributed monolith | Services are split but tightly coupled; must deploy together | Ensure each service can deploy and fail independently |
| Shared database | Multiple services read/write the same tables | Each service owns its own DB; share data via APIs or events |
| Chatty services | Service A makes 20 calls to Service B per request | Batch endpoints, denormalize data, or merge services |
| God service | One service does everything (it IS the monolith) | Apply single responsibility; break out distinct capabilities |
| Nano services | Over-decomposition; 50 services for a simple app | Merge small services until each has meaningful responsibility |
| Circular dependencies | A depends on B, B depends on A | Introduce events, a shared library, or merge |
9. Key Takeaways
- Decompose by business capability — each service should own one domain (User, Order, Payment).
- Each service owns its data — shared databases create coupling that defeats the purpose of decomposition.
- Use synchronous calls (REST/gRPC) when you need an immediate answer; use asynchronous messaging (queues/events) when you don't.
- Map your dependencies — look for circular dependencies, single points of failure, and long synchronous chains.
- API contracts are the glue between services — version them, document them, and design for failure.
- Avoid over-decomposition — start with fewer, larger services and split when there is a clear reason (scaling, team ownership, rate of change).
10. Explain-It Challenge
Without looking back, explain in your own words:
- Name five heuristics for deciding where to draw service boundaries.
- When would you use synchronous communication between services vs asynchronous? Give an example of each.
- What is a distributed monolith and why is it worse than a real monolith?
- Draw a simple component diagram for a food delivery app (Uber Eats-like) with at least 4 services.
- What is the strangler fig pattern and when would you use it?
Navigation: ← 9.7.b — Requirements Analysis · 9.7.d — Capacity Estimation →