Episode 9 — System Design / 9.8 — Communication and Data Layer

9.8.a — Networking Basics

Big Picture: Before you design any distributed system, you must understand how data physically travels from one machine to another. Networking is the plumbing of system design — invisible when it works, catastrophic when it fails.


Table of Contents

  1. How Data Travels: Client to Server
  2. DNS Resolution
  3. HTTP and HTTPS
  4. TCP vs UDP
  5. WebSockets
  6. gRPC and Protocol Buffers
  7. Latency and Bandwidth
  8. CDN Basics
  9. Network Topology in Distributed Systems
  10. Protocol Comparison Table
  11. Key Takeaways
  12. Explain-It Challenge

How Data Travels: Client to Server

When a user types https://shop.example.com/products into a browser, here is what happens:

  USER'S BROWSER                       THE INTERNET                           SERVER
  +--------------+                                                      +----------------+
  | 1. Type URL  |                                                      |                |
  | 2. DNS       |----> DNS Resolver ----> Root NS ----> .com NS        | 6. Receive     |
  |    Lookup    |<---- IP: 93.184.216.34 <---- Authoritative NS        |    request     |
  |              |                                                      |                |
  | 3. TCP       |---- SYN -------------------------------------------->| 7. Process     |
  |    Handshake |<--- SYN-ACK -----------------------------------------|                |
  |              |---- ACK -------------------------------------------->|                |
  |              |                                                      |                |
  | 4. TLS       |---- ClientHello ------------------------------------>| 8. Query DB    |
  |    Handshake |<--- ServerHello + Certificate -----------------------|                |
  |              |---- Key Exchange ----------------------------------->|                |
  |              |                                                      |                |
  | 5. HTTP GET  |---- GET /products HTTP/1.1 ------------------------->| 9. Build       |
  |    Request   |<--- 200 OK { "products": [...] } --------------------|    response    |
  |              |                                                      |                |
  |10. Render    |                                                      | (Connection    |
  |    page      |                                                      |  may stay open |
  +--------------+                                                      |  for reuse)    |
                                                                        +----------------+

Step-by-step breakdown:

  Step                    What Happens                             Time Cost
  ----------------------  ---------------------------------------  ---------------------
  1. URL entered          Browser parses scheme, host, path        ~0 ms
  2. DNS lookup           Translates domain name to IP address     20-120 ms
  3. TCP handshake        Three-way handshake (SYN, SYN-ACK, ACK)  1 RTT (~20-100 ms)
  4. TLS handshake        Negotiates encryption (HTTPS only)       1-2 RTTs (~40-200 ms)
  5. HTTP request         Sends the actual GET/POST request        Depends on payload
  6-9. Server processing  Route, auth, business logic, DB query    5-500 ms
  10. Response            Data sent back, browser renders          Depends on payload

Interview insight: A single page load can involve 50-100+ network requests (HTML, CSS, JS, images, API calls). This is why CDNs, caching, and connection reuse (HTTP/1.1 keep-alive, HTTP/2 multiplexing) matter.
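To make the cost of connection setup concrete, here is a rough back-of-the-envelope sketch. It assumes the per-step costs from the table above (1 RTT for TCP, 1 RTT for a modern TLS handshake, 1 RTT for the request itself); the function name and the sample numbers are illustrative, not measurements.

```python
def time_to_first_byte_ms(rtt_ms, dns_ms, server_ms, tls_round_trips=1, reuse_connection=False):
    """Rough estimate of the time until the first response byte arrives.

    Assumptions: TCP handshake = 1 RTT, TLS handshake = tls_round_trips RTTs,
    HTTP request/response = 1 RTT plus server processing time.
    """
    # A reused connection skips DNS, TCP, and TLS setup entirely.
    setup_ms = 0 if reuse_connection else dns_ms + rtt_ms + tls_round_trips * rtt_ms
    return setup_ms + rtt_ms + server_ms


cold = time_to_first_byte_ms(rtt_ms=50, dns_ms=30, server_ms=20)
warm = time_to_first_byte_ms(rtt_ms=50, dns_ms=30, server_ms=20, reuse_connection=True)
print(cold, warm)  # 200 70
```

With these sample numbers, setup accounts for 130 of the 200 ms on a cold connection, which is why keeping connections open pays off across dozens of requests.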


DNS Resolution

DNS (Domain Name System) translates human-readable domain names into IP addresses.

  Browser                DNS Resolver           Root NS          .com NS         shop.example.com NS
    |                        |                    |                 |                   |
    |-- shop.example.com? -->|                    |                 |                   |
    |                        |-- "." (root)? ---->|                 |                   |
    |                        |<-- "ask .com NS" --|                 |                   |
    |                        |                    |                 |                   |
    |                        |-- ".com"? -------->|---------------->|                   |
    |                        |<-- "ask example.com NS" ------------|                   |
    |                        |                    |                 |                   |
    |                        |-- "shop.example.com"? ------------->|------------------>|
    |                        |<-- "93.184.216.34" ------------------------------------|
    |                        |                    |                 |                   |
    |<-- 93.184.216.34 ------|                    |                 |                   |
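The referral chain in the diagram can be sketched as a loop that keeps asking the next server until one returns an answer. This is a toy model only: the ZONES dictionary and the server names in it are hypothetical stand-ins for real name servers, and no DNS packets are sent.

```python
# Hypothetical in-memory "name servers". Each one either knows the answer
# (records) or knows who to ask next (referrals).
ZONES = {
    "root": {"referrals": {"com": "com-ns"}},
    "com-ns": {"referrals": {"example.com": "example-ns"}},
    "example-ns": {"records": {"shop.example.com": "93.184.216.34"}},
}


def resolve(name):
    """Walk referrals from the root, as a recursive resolver would."""
    server = "root"
    path = [server]
    while True:
        zone = ZONES[server]
        if name in zone.get("records", {}):
            return zone["records"][name], path
        # Follow the referral whose suffix matches the queried name.
        for suffix, next_server in zone.get("referrals", {}).items():
            if name == suffix or name.endswith("." + suffix):
                server = next_server
                path.append(server)
                break
        else:
            raise LookupError(f"no server knows about {name}")


ip, path = resolve("shop.example.com")
print(ip, path)  # 93.184.216.34 ['root', 'com-ns', 'example-ns']
```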

DNS Caching Layers

  +------------------+     +-------------------+     +------------------+     +-----------------+
  | Browser Cache    | --> | OS Cache          | --> | Router/ISP Cache | --> | DNS Resolver    |
  | (minutes)        |     | (/etc/hosts, stub)|     | (hours)          |     | (Recursive)     |
  +------------------+     +-------------------+     +------------------+     +-----------------+
         TTL: ~60s               TTL: varies              TTL: varies            TTL: configured

DNS Record Types

  Record  Purpose                             Example
  ------  ----------------------------------  ---------------------------------------
  A       Maps domain to IPv4                 shop.example.com -> 93.184.216.34
  AAAA    Maps domain to IPv6                 shop.example.com -> 2606:2800:220:1:...
  CNAME   Alias to another domain             www.shop.com -> shop.example.com
  MX      Mail server                         example.com -> mail.example.com
  NS      Name server for the zone            example.com -> ns1.example.com
  TXT     Arbitrary text (SPF, verification)  example.com -> "v=spf1 ..."

System design relevance: DNS-based load balancing returns different IPs for the same domain, spreading traffic across data centers. Services like Route 53 (AWS) can do geographic or latency-based routing at the DNS level.
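A minimal sketch of round-robin DNS load balancing, assuming a name server that rotates the order of its A records on every query so that successive clients try different addresses first. The RoundRobinDNS class is hypothetical, and the IP addresses come from ranges reserved for documentation.

```python
from collections import deque


class RoundRobinDNS:
    """Toy authoritative server that rotates its answers on each query."""

    def __init__(self, records):
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name):
        ips = self._records[name]
        answer = list(ips)   # full record set, in the current order
        ips.rotate(-1)       # the next query sees a different first address
        return answer


dns = RoundRobinDNS({"shop.example.com": ["192.0.2.10", "198.51.100.10", "203.0.113.10"]})
for _ in range(4):
    print(dns.resolve("shop.example.com")[0])
# 192.0.2.10
# 198.51.100.10
# 203.0.113.10
# 192.0.2.10
```

Geographic or latency-based routing follows the same idea, except that the answer is chosen from the location of the querying resolver rather than by rotation.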


HTTP and HTTPS

HTTP Versions

  HTTP/1.0         HTTP/1.1         HTTP/2           HTTP/3
  (1996)           (1997)           (2015)           (2022)
  +------------+   +------------+   +------------+   +------------+
  | 1 req      |   | Keep-alive |   | Multi-     |   | QUIC       |
  | per        |   | Pipelining |   | plexing    |   | (UDP)      |
  | conn       |   | (rarely    |   | Binary     |   | 0-RTT      |
  +------------+   |  used)     |   | frames     |   | No HOL     |
                   +------------+   | Header     |   | blocking   |
                                    | compression|   +------------+
                                    | Server     |
                                    | push       |
                                    +------------+

  Version   Key Feature                       Connection Model
  --------  --------------------------------  -------------------------------------------------
  HTTP/1.0  Basic request-response            New TCP connection per request
  HTTP/1.1  Keep-alive, chunked transfer      Persistent connections, but head-of-line blocking
  HTTP/2    Multiplexing, header compression  Single TCP connection, multiple streams
  HTTP/3    QUIC (UDP-based)                  No TCP head-of-line blocking, faster handshakes

HTTPS (TLS)

HTTPS = HTTP + TLS (Transport Layer Security). TLS provides:

  • Encryption — Data cannot be read in transit
  • Authentication — Server proves identity via certificate
  • Integrity — Data cannot be tampered with
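
Simplified handshake, TLS 1.2 style (TLS 1.3 completes in one round trip by sending key shares in the hello messages):

  Client                                  Server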
    |---- ClientHello (supported ciphers) ---->|
    |<--- ServerHello + Certificate -----------|
    |                                          |
    |  (Client verifies certificate against    |
    |   trusted Certificate Authorities)       |
    |                                          |
    |---- Key Exchange (PreMasterSecret) ----->|
    |                                          |
    |  (Both sides derive session keys)        |
    |                                          |
    |<========= Encrypted HTTP traffic =======>|
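In application code the handshake is usually delegated to a TLS library. The sketch below uses Python's standard ssl module; make_client_context and fetch_tls_info are illustrative helper names, and fetch_tls_info needs network access, so it is defined but not called here.

```python
import socket
import ssl


def make_client_context():
    # The default context loads the system's trusted CAs, requires a valid
    # server certificate, and checks that it matches the hostname.
    return ssl.create_default_context()


def fetch_tls_info(host, port=443, timeout=5.0):
    """Connect, run the TLS handshake, and report what was negotiated."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # wrap_socket performs the handshake: hello messages, certificate
        # verification, and key exchange.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version(), tls_sock.cipher()[0]


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)  # True True
```

Calling fetch_tls_info with a real hostname would return the negotiated protocol version and cipher suite name for that connection.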

TCP vs UDP

  TCP (Transmission Control Protocol)        UDP (User Datagram Protocol)
  ===================================        ============================

  +-------+    SYN     +-------+             +-------+  data   +-------+
  |Client |----------->|Server |             |Client |-------->|Server |
  |       |<-----------|       |             |       |         |       |
  |       |  SYN-ACK   |       |             +-------+         +-------+
  |       |----------->|       |              (fire and forget)
  |       |    ACK     |       |
  +-------+            +-------+
  (connection established)

  Features:                                  Features:
  - Reliable delivery (ACKs)                 - No connection setup
  - Ordered packets                          - No guaranteed delivery
  - Flow control                             - No ordering
  - Congestion control                       - Minimal overhead
  - Error checking                           - Lower latency

Comparison Table

  Feature      TCP                                  UDP
  -----------  -----------------------------------  ----------------------------------
  Connection   Connection-oriented (handshake)      Connectionless
  Reliability  Guaranteed delivery (retransmits)    Best effort
  Ordering     Packets arrive in order              No ordering guarantee
  Speed        Slower (overhead)                    Faster (minimal overhead)
  Header size  20-60 bytes                          8 bytes
  Use cases    Web (HTTP/1.1, HTTP/2), email,       Real-time video, gaming, DNS, VoIP
               file transfer

When to Use What

  Scenario               Choose           Why
  ---------------------  ---------------  ------------------------------------------------
  REST API               TCP (HTTP)       Need reliable, ordered delivery
  Live video/voice call  UDP              Dropped frame is better than delayed stream
  Online gaming          UDP              Low latency matters more than every packet
  File download          TCP              Every byte must arrive correctly
  DNS query              UDP              Small, single request-response (TCP fallback for
                                          large responses)
  Chat messages          TCP (WebSocket)  Every message must arrive
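The difference in connection setup shows up directly in socket code. This is a minimal sketch over the loopback interface using Python's standard socket module; the function names are illustrative. On loopback the UDP datagram arrives in practice, but nothing in the protocol guarantees that on a real network.

```python
import socket


def udp_roundtrip(payload):
    """Send one datagram: no handshake, no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind(("127.0.0.1", 0))       # port 0 = let the OS pick a free port
        server.settimeout(2.0)
        client.sendto(payload, server.getsockname())
        data, _addr = server.recvfrom(2048)
        return data


def tcp_roundtrip(payload):
    """Connect first (three-way handshake), then send over the byte stream."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            conn, _addr = listener.accept()
            with conn:
                conn.settimeout(2.0)
                client.sendall(payload)
                # TCP is a byte stream: a large payload could need several recv calls.
                return conn.recv(2048)


print(udp_roundtrip(b"ping"), tcp_roundtrip(b"ping"))  # b'ping' b'ping'
```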

WebSockets

WebSockets provide full-duplex, persistent communication over a single TCP connection.

  HTTP Request-Response                WebSocket
  ========================            ========================

  Client        Server                Client        Server
    |-- GET ------->|                   |-- GET (Upgrade) ->|
    |<-- 200 -------|                   |<-- 101 Switching--|
    |               |                   |                   |
    |-- GET ------->|                   |<== Bidirectional =>|
    |<-- 200 -------|                   |<== messages ======>|
    |               |                   |<== flowing =======>|
    |-- GET ------->|                   |                   |
    |<-- 200 -------|                   |-- close --------->|
    |               |                   |<-- close ---------|

WebSocket Handshake

# Client Request (HTTP Upgrade)
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

# Server Response
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
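The Sec-WebSocket-Accept value is not arbitrary. Per RFC 6455, the server appends a fixed GUID to the client's key, hashes the result with SHA-1, and base64-encodes the digest. A short sketch (the function name is illustrative) that reproduces the value in the handshake above:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept header for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

This proves the server understood the upgrade request as a WebSocket handshake; it is not an authentication or encryption mechanism.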

When to Use WebSockets

  Use Case               Why WebSockets
  ---------------------  -------------------------------------
  Chat applications      Real-time bidirectional messaging
  Live notifications     Server pushes updates instantly
  Collaborative editing  Multiple users editing same document
  Live sports scores     Continuous real-time updates
  Stock tickers          High-frequency price updates

When NOT to Use WebSockets

  • Simple CRUD operations — REST is simpler and sufficient
  • Infrequent updates — Long polling or SSE may be lighter
  • One-directional server-to-client — Server-Sent Events (SSE) is simpler

Alternative: Server-Sent Events (SSE)

  SSE (Server-Sent Events)
  ========================

  Client        Server
    |-- GET ------->|
    |<-- text/event-stream --|
    |<-- data: update 1 -----|
    |<-- data: update 2 -----|
    |<-- data: update 3 -----|
    |               |
    (one-directional: server to client only)

  Feature       WebSocket      SSE               Long Polling
  ------------  -------------  ----------------  ----------------
  Direction     Bidirectional  Server -> Client  Server -> Client
  Protocol      WS (TCP)       HTTP              HTTP
  Reconnection  Manual         Automatic         Manual
  Binary data   Yes            No (text only)    Yes
  Complexity    Medium         Low               Low
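Part of why SSE rates as low complexity is that the wire format is plain text: each event is a few "field: value" lines followed by a blank line. A minimal sketch of a server-side formatter, where format_sse is an illustrative helper and not part of any framework:

```python
def format_sse(data, event=None, event_id=None):
    """Format one message for a text/event-stream response body."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")      # lets the client resume after reconnecting
    if event is not None:
        lines.append(f"event: {event}")      # named event type
    # Multi-line payloads are sent as several data: lines.
    for line in data.splitlines() or [""]:
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"         # a blank line terminates the event


print(repr(format_sse("update 1")))  # 'data: update 1\n\n'
```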

gRPC and Protocol Buffers

gRPC is a high-performance RPC framework built on HTTP/2, using Protocol Buffers for serialization.

  REST (JSON over HTTP)                  gRPC (Protobuf over HTTP/2)
  =====================                  ==========================

  POST /api/users                        service UserService {
  Content-Type: application/json           rpc GetUser(UserRequest)
  { "name": "Alice", "age": 30 }              returns (UserResponse);
                                          }
  ~100 bytes                              ~20 bytes (binary)

  Human-readable                          Not human-readable
  Slower serialization                    Usually faster serialization
  Any HTTP client works                   Needs gRPC client/codegen

Protocol Buffer Example

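// user.proto
syntax = "proto3";

service UserService {
  rpc GetUser (GetUserRequest) returns (User);              // unary
  rpc ListUsers (ListUsersRequest) returns (stream User);   // server streaming
  rpc Chat (stream Message) returns (stream Message);       // bidirectional streaming
}

message GetUserRequest {
  string user_id = 1;
}

// Needed for the file to compile; the field is illustrative.
message ListUsersRequest {
  int32 page_size = 1;
}

message User {
  string id = 1;
  string name = 2;
  string email = 3;
  int32 age = 4;
}

// Needed for the file to compile; the fields are illustrative.
message Message {
  string sender_id = 1;
  string text = 2;
}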
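To see where the size savings come from, the sketch below hand-encodes a User with name = "Alice" (field 2) and age = 30 (field 4) using the Protocol Buffers wire format, then compares it with compact JSON. It is for illustration only: real code would use the classes generated from the .proto file, and this encoder handles only strings and non-negative integers.

```python
import json


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number, text):
    payload = text.encode("utf-8")
    tag = (field_number << 3) | 2    # wire type 2 = length-delimited
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def encode_int_field(field_number, value):
    tag = (field_number << 3) | 0    # wire type 0 = varint
    return encode_varint(tag) + encode_varint(value)


protobuf_bytes = encode_string_field(2, "Alice") + encode_int_field(4, 30)
json_bytes = json.dumps({"name": "Alice", "age": 30}, separators=(",", ":")).encode("utf-8")
print(protobuf_bytes.hex(), len(protobuf_bytes), len(json_bytes))  # 1205416c696365201e 9 25
```

Field names never appear on the wire, only field numbers, which is also why those numbers must not be changed once a schema is in use.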

gRPC Communication Patterns

  1. Unary             2. Server Streaming    3. Client Streaming    4. Bidirectional
  ============         ==================     ==================     ================
  Client    Server     Client    Server       Client    Server       Client    Server
    |--req--->|          |--req--->|            |--msg1-->|            |--msg-->|
    |<--res---|          |<--msg1--|            |--msg2-->|            |<--msg--|
                         |<--msg2--|            |--msg3-->|            |--msg-->|
                         |<--msg3--|            |<--res---|            |<--msg--|

When to Use gRPC vs REST

  Factor           REST                          gRPC
  ---------------  ----------------------------  ----------------------------------------
  Client type      Browsers, any HTTP client     Backend services (needs client lib)
  Payload size     Larger (JSON text)            Smaller (binary protobuf)
  Performance      Good                          Often better; the gain depends on workload
  Browser support  Native                        Limited (needs gRPC-Web proxy)
  Streaming        Workarounds (SSE, WebSocket)  Native (4 patterns)
  Schema           OpenAPI/Swagger (optional)    .proto files (mandatory, strongly typed)
  Debugging        Easy (human-readable JSON)    Harder (binary)
  Best for         Public APIs, web apps         Microservice-to-microservice

Latency and Bandwidth

Latency

Latency = time for a single unit of data to travel from source to destination.

  LATENCY NUMBERS EVERY DEVELOPER SHOULD KNOW
  ============================================

  L1 cache reference .......................... 0.5 ns
  L2 cache reference ..........................   7 ns
  Main memory reference ....................... 100 ns
  SSD random read .......................... 16,000 ns   (16 us)
  HDD random read ....................... 2,000,000 ns   (2 ms)
  Send packet CA -> Netherlands -> CA ... 150,000,000 ns (150 ms)

  NETWORK LATENCY (approximate round-trip)
  ========================================
  Same data center ......................... 0.5 ms
  Same region (e.g., us-east) .............. 1-5 ms
  Cross-region (US East -> US West) ........ 30-70 ms
  Cross-continent (US -> Europe) ........... 80-150 ms
  Cross-world (US -> Australia) ............ 150-300 ms

Bandwidth

Bandwidth = maximum data that can be transferred per unit of time.

  BANDWIDTH COMPARISON
  ====================
  3G Mobile ................ 1-5 Mbps
  4G/LTE Mobile ............ 10-50 Mbps
  5G Mobile ................ 100-1000 Mbps
  Home Wi-Fi ............... 50-500 Mbps
  Ethernet (Office) ........ 1 Gbps
  Data center internal ...... 10-100 Gbps
  AWS region to region ...... 5-25 Gbps

Latency vs Bandwidth

  Concept     Analogy                                         Matters When
  ----------  ----------------------------------------------  ------------------------------------
  Latency     How long it takes a truck to drive from A to B  Many small requests (API calls)
  Bandwidth   How much cargo the truck can carry              Large data transfers (video, backups)
  Throughput  Actual cargo delivered per hour (real-world)    Overall system capacity

Key insight: For most web applications, latency dominates. Reducing round trips (batching, caching, CDNs) often matters more than increasing bandwidth.
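A quick calculation shows why. The sketch below models a transfer as one round trip plus the time to push the bytes through the pipe. That is a deliberate simplification that ignores TCP slow start and packet loss, and the function name and sample numbers are illustrative.

```python
def transfer_time_ms(size_bytes, rtt_ms, bandwidth_mbps):
    """Approximate time to request and receive size_bytes of data."""
    bits = size_bytes * 8
    send_ms = bits / (bandwidth_mbps * 1_000_000) * 1000
    return rtt_ms + send_ms


# A 2 KB API response over a 100 ms RTT link at 50 Mbps:
print(round(transfer_time_ms(2_000, rtt_ms=100, bandwidth_mbps=50), 2))   # 100.32
# Doubling the bandwidth barely helps...
print(round(transfer_time_ms(2_000, rtt_ms=100, bandwidth_mbps=100), 2))  # 100.16
# ...while halving the round-trip time nearly halves the total.
print(round(transfer_time_ms(2_000, rtt_ms=50, bandwidth_mbps=50), 2))    # 50.32
```

For a 1 GB backup the picture flips: the send time is about 160 seconds at 50 Mbps, and the round trip becomes negligible.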


CDN Basics

A CDN (Content Delivery Network) caches content at edge servers geographically close to users.

  WITHOUT CDN                              WITH CDN
  ===========                              ========

  User (Tokyo)                             User (Tokyo)
    |                                        |
    |--- 150ms ----------------> Origin      |--- 5ms ---> CDN Edge (Tokyo)
    |    (US East)               Server      |             |
    |<--- 150ms ------------------|          |<-- 5ms -----|
    |                                        |
    Total: ~300ms+ per request               Total: ~10ms per request
                                             (Origin only hit on cache miss)

What CDNs Cache

  Content Type                    Cacheable?  TTL Strategy
  ------------------------------  ----------  -----------------------------------
  Static files (JS, CSS, images)  Always      Long TTL (days/weeks)
  HTML pages                      Often       Short TTL (minutes) or invalidation
  API responses                   Sometimes   Short TTL with cache headers
  User-specific data              Rarely      Usually not cached at CDN
  Video/audio                     Always      Long TTL

CDN Cache Flow

  User ----> CDN Edge
               |
               |-- Cache HIT? --> Return cached content (fast)
               |
               |-- Cache MISS? --> Fetch from origin server
               |                   |
               |                   |--> Store in edge cache
               |                   |--> Return to user
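The hit/miss flow above maps onto a small TTL cache in front of an origin fetch. This is a sketch under simplifying assumptions: one edge node, a single TTL for every path, and no eviction or invalidation. EdgeCache and the origin function are hypothetical names.

```python
import time


class EdgeCache:
    """Toy CDN edge: serve from cache while fresh, otherwise ask the origin."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # path -> (expires_at, body)

    def get(self, path):
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1], "HIT"
        body = self._fetch(path)                      # cache miss: go to the origin
        self._store[path] = (now + self._ttl, body)   # store in the edge cache
        return body, "MISS"


origin_calls = []

def origin(path):
    origin_calls.append(path)
    return f"content of {path}"

now = [0.0]  # fake clock so the example does not need to sleep
edge = EdgeCache(origin, ttl_seconds=60, clock=lambda: now[0])
print(edge.get("/app.js")[1])  # MISS
print(edge.get("/app.js")[1])  # HIT
now[0] = 61.0
print(edge.get("/app.js")[1])  # MISS
```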

Popular CDN Providers

  Provider          Strengths
  ----------------  -------------------------------------------------
  Cloudflare        DDoS protection, free tier, Workers (edge compute)
  AWS CloudFront    Deep AWS integration, Lambda@Edge
  Akamai            Largest network, enterprise-grade
  Fastly            Real-time purging, edge compute (Compute@Edge)
  Google Cloud CDN  Integrates with GCP load balancer

Network Topology in Distributed Systems

Common Patterns

  1. CLIENT-SERVER                     2. PEER-TO-PEER
  ==================                  ================

  +--------+    +--------+            +---+     +---+
  | Client |--->| Server |            | A |<--->| B |
  +--------+    +--------+            +---+     +---+
  +--------+        ^                   ^  \   /  ^
  | Client |--------|                   |   \ /   |
  +--------+                            v    X    v
                                       +---+/ \+---+
                                       | C |<->| D |
                                       +---+   +---+

  3. HUB AND SPOKE                    4. MESH (Microservices)
  ==================                  ======================

  +---+     +-----+     +---+        +---+   +---+   +---+
  | A |---->| HUB |<----| B |        | A |<->| B |<->| C |
  +---+     +-----+     +---+        +---+   +---+   +---+
              ^ ^                       |       |       |
  +---+      | |       +---+         +---+   +---+   +---+
  | C |------+ +-------| D |        | D |<->| E |<->| F |
  +---+                +---+        +---+   +---+   +---+

Service Mesh

In microservice architectures, a service mesh manages service-to-service communication:

  +-------------------+          +-------------------+
  | Service A         |          | Service B         |
  |  +-------------+  |          |  +-------------+  |
  |  | App Code    |  |          |  | App Code    |  |
  |  +------+------+  |          |  +------+------+  |
  |         |         |          |         |         |
  |  +------v------+  |  mTLS   |  +------v------+  |
  |  | Sidecar     |<------------>  | Sidecar     |  |
  |  | Proxy       |  |          |  | Proxy       |  |
  |  | (Envoy)     |  |          |  | (Envoy)     |  |
  |  +-------------+  |          |  +-------------+  |
  +-------------------+          +-------------------+
         |                              |
         +---------> Control Plane <----+
                     (Istio, Linkerd)

The sidecar proxy handles:

  • Service discovery — Finding other services
  • Load balancing — Distributing requests
  • mTLS — Mutual TLS encryption
  • Retries and circuit breaking — Resilience
  • Observability — Metrics, tracing, logging
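Of these responsibilities, circuit breaking is the least self-explanatory. The sketch below shows the core idea as application-level code: after a run of failures, stop calling the unhealthy service for a while and fail fast instead. It illustrates the pattern only, not how Envoy or any particular mesh implements it, and the class names and thresholds are assumptions.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses a call without trying it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_seconds=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._reset_after = reset_after_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed (calls allowed)

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            self._opened_at = None  # half-open: let one trial call through
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self._threshold:
                self._opened_at = self._clock()  # trip (or re-trip) the breaker
            raise
        self._failures = 0  # any success resets the failure count
        return result
```

A caller would wrap each outbound request, for example breaker.call(send_request, payload), where send_request is whatever function performs the remote call. A sidecar applies the same logic transparently, so the application code does not have to.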

Protocol Comparison Table

  Protocol   Layer        Connection           Use Case                 Latency   Complexity
  ---------  -----------  -------------------  -----------------------  --------  ----------
  HTTP/1.1   Application  TCP, persistent      Web apps, REST APIs      Medium    Low
  HTTP/2     Application  TCP, multiplexed     Web apps, APIs           Low       Low
  HTTP/3     Application  QUIC (UDP)           Modern web               Lowest    Medium
  WebSocket  Application  TCP, persistent      Real-time bidirectional  Low       Medium
  SSE        Application  HTTP, persistent     Real-time server push    Low       Low
  gRPC       Application  HTTP/2               Microservices RPC        Very Low  Medium
  TCP        Transport    Connection-oriented  Reliable data transfer   Medium    N/A
  UDP        Transport    Connectionless       Streaming, gaming        Low       N/A
  DNS        Application  UDP (usually)        Name resolution          Variable  N/A

Key Takeaways

  1. Every network request adds latency — DNS lookup, TCP handshake, TLS handshake, data transfer. Minimize round trips.
  2. TCP prioritizes reliable, ordered delivery; UDP prioritizes low latency — Choose based on whether correctness or timeliness matters more.
  3. WebSockets are for real-time bidirectional communication — Do not use them for simple request-response patterns.
  4. gRPC excels for internal microservice communication — Binary serialization, streaming, and strong typing make it faster than REST for service-to-service calls.
  5. CDNs reduce latency by moving data closer to users — Static assets should almost always be served from a CDN.
  6. DNS is a potential bottleneck and a load-balancing tool — Caching and geographic routing at the DNS layer are common in system design.
  7. Latency dominates bandwidth for most web apps — Focus on reducing round trips, not just increasing pipe size.

Explain-It Challenge

Scenario: Your friend asks you how a video call (like Zoom) works at the network level. Explain:

  1. Why video and audio use UDP instead of TCP
  2. What happens when a packet is lost during a call
  3. Why there is a slight delay when talking to someone on another continent
  4. How a CDN would NOT help with a live video call (but helps with pre-recorded videos)

Keep your explanation under 2 minutes, as if talking to a non-technical person.


Next -> 9.8.b — API Design