Episode 9 — System Design / 9.9 — Core Infrastructure

9.9.d API Gateway

What API Gateways Do

An API Gateway is a single entry point for all client requests to a backend system. It acts as a reverse proxy that routes requests to appropriate services while handling cross-cutting concerns like authentication, rate limiting, and request transformation.

  WITHOUT API Gateway:
  
  Mobile App --> User Service     (auth, rate limiting in each service)
  Mobile App --> Order Service    (auth, rate limiting in each service)
  Mobile App --> Payment Service  (auth, rate limiting in each service)
  Web App    --> User Service     (duplicated logic everywhere)
  Web App    --> Order Service
  
  Problems: Client must know all service URLs, auth duplicated, no central control
  
  
  WITH API Gateway:
  
  Mobile App ---+
                |
  Web App ------+--> API Gateway --> User Service
                |         |      --> Order Service
  Partner API --+         |      --> Payment Service
                          |
                    Handles: Auth, Rate Limiting, Routing,
                    Logging, Transformation, Caching

Core Responsibilities

1. Request Routing

The gateway maps external URLs to internal service endpoints.

  Routing Table:
  
  External URL                    Internal Service
  ─────────────────────────────────────────────────
  GET  /api/users/*            -> user-service:8080/users/*
  POST /api/orders/*           -> order-service:8081/orders/*
  GET  /api/products/*         -> product-service:8082/products/*
  POST /api/payments/*         -> payment-service:8083/payments/*
  
  
  Example:
  Client: GET https://api.example.com/api/users/42
  Gateway routes to: http://user-service:8080/users/42
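
The routing table above can be sketched as a prefix lookup. This is a minimal illustration, not how any particular gateway implements routing; the `ROUTES` structure and the `resolve` function are hypothetical names, and the hosts and ports are taken from the example table.

```python
from typing import Optional

# Hypothetical routing table mirroring the example above:
# (method, external prefix, internal base URL, internal prefix)
ROUTES = [
    ("GET", "/api/users/", "http://user-service:8080", "/users/"),
    ("POST", "/api/orders/", "http://order-service:8081", "/orders/"),
    ("GET", "/api/products/", "http://product-service:8082", "/products/"),
    ("POST", "/api/payments/", "http://payment-service:8083", "/payments/"),
]


def resolve(method: str, path: str) -> Optional[str]:
    """Return the internal URL for a request, or None if no route matches."""
    for route_method, ext_prefix, base, int_prefix in ROUTES:
        if method == route_method and path.startswith(ext_prefix):
            # Swap the public prefix for the internal one and keep the rest.
            return base + int_prefix + path[len(ext_prefix):]
    return None


print(resolve("GET", "/api/users/42"))  # -> http://user-service:8080/users/42
```

A request that matches no route returns None here; a real gateway would typically answer 404 in that case.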

Path-based routing:

  /api/v1/* --> Service Pool A (version 1)
  /api/v2/* --> Service Pool B (version 2)

Header-based routing:

  X-Client-Type: mobile  --> Mobile-optimized service
  X-Client-Type: web     --> Web service

Weight-based routing (canary deployments):

  /api/users/* --> Service v1 (90% of traffic)
                --> Service v2 (10% of traffic)
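
One way to picture the 90/10 split is a weighted random choice per request. The sketch below assumes two hypothetical backend names and uses Python's standard `random.choices`; production gateways often hash a user or session ID instead so that a given client consistently lands on the same version.

```python
import random
from collections import Counter

# Hypothetical backends and weights matching the 90/10 split above.
BACKENDS = ["service-v1", "service-v2"]
WEIGHTS = [90, 10]


def pick_backend(rng: random.Random) -> str:
    """Choose a backend with probability proportional to its weight."""
    return rng.choices(BACKENDS, weights=WEIGHTS, k=1)[0]


rng = random.Random(0)  # seeded only so the demo is repeatable
counts = Counter(pick_backend(rng) for _ in range(10_000))
print(counts)  # roughly 9,000 for v1 and 1,000 for v2
```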

2. Authentication and Authorization

The gateway validates identity and permissions before requests reach backend services.

  Authentication Flow:
  
  Client                  API Gateway              Auth Service        Backend
    |                         |                         |                 |
    |-- Request + JWT ------->|                         |                 |
    |                         |-- Validate JWT -------->|                 |
    |                         |<-- Valid, user=42 ------|                 |
    |                         |                         |                 |
    |                         |-- Forward request ----->|---------------->|
    |                         |   + X-User-Id: 42      |                 |
    |                         |   + X-Roles: admin     |                 |
    |<-- Response ------------|<------------------------|<----------------|
    
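  Backend services TRUST the gateway's headers.
  They never validate JWT themselves.
  This is only safe if the gateway strips any client-supplied
  X-User-Id / X-Roles headers and the backends are reachable
  only through the gateway.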

Common auth patterns at the gateway:

Pattern	How It Works	Use Case
JWT validation	Gateway verifies JWT signature and claims	Stateless auth
OAuth2 token introspection	Gateway calls auth server to validate opaque token	Third-party tokens
API key validation	Gateway checks API key against a registry	Partner/developer APIs
mTLS	Gateway verifies client certificate	Service-to-service, B2B
Basic Auth	Gateway checks username/password	Simple internal APIs

3. Rate Limiting

The gateway controls how many requests a client can make in a given time window.

  Rate Limiting Example:
  
  Plan: Free tier = 100 requests/minute
  
  Request  1:  200 OK  (remaining: 99)
  Request  2:  200 OK  (remaining: 98)
  ...
  Request 100: 200 OK  (remaining: 0)
  Request 101: 429 Too Many Requests
               Retry-After: 30
  
  Response headers:
  X-RateLimit-Limit: 100
  X-RateLimit-Remaining: 0
  X-RateLimit-Reset: 1680000060
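
The example above can be reproduced with a small fixed-window limiter. This is a sketch under simplifying assumptions: counts live in an in-process dictionary (a real gateway would usually keep them in a shared store), old windows are never evicted, and the class and method names are hypothetical.

```python
import time
from typing import Dict, Optional, Tuple


class FixedWindowLimiter:
    """Counts requests per client in fixed windows (old windows are not evicted)."""

    def __init__(self, limit: int = 100, window_s: int = 60):
        self.limit = limit
        self.window_s = window_s
        self.counts: Dict[Tuple[str, int], int] = {}

    def check(self, client_id: str, now: Optional[float] = None) -> Tuple[int, dict]:
        """Return (HTTP status, rate-limit headers) for one request."""
        now = time.time() if now is None else now
        window = int(now // self.window_s)
        reset_at = (window + 1) * self.window_s  # epoch second when the window ends
        used = self.counts.get((client_id, window), 0)
        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Reset": str(reset_at),
        }
        if used >= self.limit:
            headers["X-RateLimit-Remaining"] = "0"
            headers["Retry-After"] = str(max(1, int(reset_at - now)))
            return 429, headers
        self.counts[(client_id, window)] = used + 1
        headers["X-RateLimit-Remaining"] = str(self.limit - used - 1)
        return 200, headers


limiter = FixedWindowLimiter(limit=100, window_s=60)
# 101 requests arriving 30 seconds into the window that ends at 1680000060.
results = [limiter.check("alice", now=1_680_000_030.0) for _ in range(101)]
print(results[0][0], results[99][0], results[100][0])  # -> 200 200 429
```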

Rate limiting algorithms:

Algorithm	Description	Pros	Cons
Fixed Window	Count requests in fixed time windows (e.g., per minute)	Simple	Burst at window boundary
Sliding Window Log	Track timestamps of all requests	Accurate	Memory-intensive
Sliding Window Counter	Current window's count plus the previous window's count weighted by how much of it still overlaps the sliding window	Good balance	Slightly approximate
Token Bucket	Tokens added at fixed rate; request consumes a token	Allows bursts	More complex
Leaky Bucket	Requests processed at fixed rate; excess queued/dropped	Smooth output	No burst handling
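
As an illustration of the sliding window counter row, the sketch below estimates the request count as the current window's count plus the previous window's count scaled by its remaining overlap. It tracks a single client for brevity, and the class name and parameters are hypothetical.

```python
class SlidingWindowCounter:
    """Approximate sliding window for one client, built from two fixed windows."""

    def __init__(self, limit: int, window_s: float):
        self.limit = limit
        self.window_s = window_s
        self.window_id = None  # index of the current fixed window
        self.curr = 0          # requests counted in the current window
        self.prev = 0          # requests counted in the previous window

    def allow(self, now: float) -> bool:
        window_id = int(now // self.window_s)
        if self.window_id is None:
            self.window_id = window_id
        if window_id != self.window_id:
            # Roll forward; if more than one window passed, the old count is stale.
            self.prev = self.curr if window_id == self.window_id + 1 else 0
            self.curr = 0
            self.window_id = window_id
        elapsed = (now % self.window_s) / self.window_s  # fraction of window used
        estimated = self.prev * (1 - elapsed) + self.curr
        if estimated >= self.limit:
            return False
        self.curr += 1
        return True


limiter = SlidingWindowCounter(limit=10, window_s=60)
late_burst = sum(limiter.allow(59.0) for _ in range(12))   # end of window 0
at_boundary = sum(limiter.allow(60.0) for _ in range(10))  # start of window 1
mid_window = sum(limiter.allow(90.0) for _ in range(10))   # halfway through window 1
print(late_burst, at_boundary, mid_window)  # -> 10 0 5
```

A plain fixed window would have admitted another 10 requests at t=60s, which is the boundary burst listed in the table.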

  Token Bucket:
  
  Bucket capacity: 10 tokens
  Refill rate: 2 tokens/second
  
  t=0s:  [##########]  10 tokens (full)
  t=0s:  3 requests --> [#######]  7 tokens
  t=1s:  refill +2  --> [#########]  9 tokens
  t=1s:  5 requests --> [####]  4 tokens
  t=2s:  refill +2  --> [######]  6 tokens
  
  If bucket empty: reject request (429)
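
The trace above can be replayed with a short token bucket sketch. The class below is a minimal single-client illustration with hypothetical names; time is passed in explicitly so the example is deterministic.

```python
class TokenBucket:
    """Token bucket with lazy refill, computed from elapsed time on each call."""

    def __init__(self, capacity: float = 10, refill_rate: float = 2.0, now: float = 0.0):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)   # start full
        self.last = now

    def _refill(self, now: float) -> None:
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now

    def available(self, now: float) -> float:
        """Tokens in the bucket at time `now`, without consuming any."""
        self._refill(now)
        return self.tokens

    def allow(self, now: float, cost: float = 1) -> bool:
        """Consume `cost` tokens if possible; False means respond with 429."""
        self._refill(now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(capacity=10, refill_rate=2.0)
for _ in range(3):
    bucket.allow(0.0)            # t=0s: 3 requests -> 7 tokens
after_first = bucket.tokens
refilled = bucket.available(1.0)  # t=1s: refill +2 -> 9 tokens
for _ in range(5):
    bucket.allow(1.0)            # t=1s: 5 requests -> 4 tokens
print(after_first, refilled, bucket.tokens, bucket.available(2.0))  # -> 7.0 9.0 4.0 6.0
```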

Rate limiting dimensions:

Per user/API key
Per IP address
Per endpoint
Per service/tenant
Global (total system throughput)

4. Request/Response Transformation

The gateway can modify requests before they reach services and modify responses before they reach clients.

  Request Transformation:
  
  Client sends:
  POST /api/orders
  Authorization: Bearer eyJhbG...
  Content-Type: application/json
  {"item": "laptop", "qty": 1}
  
  Gateway transforms to:
  POST /internal/orders/create
  X-User-Id: 42
  X-Request-Id: uuid-abc-123
  X-Forwarded-For: 203.0.113.5
  Content-Type: application/json
  {"item": "laptop", "qty": 1, "user_id": 42, "timestamp": "2026-04-11T..."}
  
  
  Response Transformation:
  
  Service returns:
  {"user_id": 42, "internal_score": 85, "name": "Alice", ...}
  
  Gateway transforms to (remove internal fields):
  {"name": "Alice", ...}

Common transformations:

Transformation	Example
Header injection	Add X-Request-Id, X-Forwarded-For
Header removal	Strip internal headers from responses
Protocol translation	REST to gRPC, HTTP to WebSocket
Payload modification	Add fields, remove sensitive data
Response filtering	Return only requested fields
Format conversion	XML to JSON, JSON to Protocol Buffers
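
A few of these transformations can be sketched as plain dictionary operations. The header and field lists below are assumptions chosen to match the earlier example, and the function names are hypothetical; the sketch omits the timestamp field shown in that example.

```python
import uuid

SPOOFABLE_HEADERS = {"authorization", "x-user-id", "x-roles"}  # assumed policy
INTERNAL_FIELDS = {"user_id", "internal_score"}  # assumed internal-only fields


def transform_request(headers: dict, body: dict, user_id: int, client_ip: str):
    """Strip client-controlled identity headers, then inject trusted ones."""
    forwarded = {k: v for k, v in headers.items() if k.lower() not in SPOOFABLE_HEADERS}
    forwarded["X-User-Id"] = str(user_id)
    forwarded["X-Request-Id"] = str(uuid.uuid4())
    forwarded["X-Forwarded-For"] = client_ip
    return forwarded, {**body, "user_id": user_id}


def transform_response(body: dict) -> dict:
    """Remove fields that should never leave the internal network."""
    return {k: v for k, v in body.items() if k not in INTERNAL_FIELDS}


req_headers, req_body = transform_request(
    {"Authorization": "Bearer eyJhbG...", "Content-Type": "application/json", "x-user-id": "1"},
    {"item": "laptop", "qty": 1},
    user_id=42,
    client_ip="203.0.113.5",
)
print(req_body)  # -> {'item': 'laptop', 'qty': 1, 'user_id': 42}
print(transform_response({"user_id": 42, "internal_score": 85, "name": "Alice"}))
# -> {'name': 'Alice'}
```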

5. Request Aggregation (API Composition)

The gateway can combine responses from multiple services into a single response.

  Without Aggregation (client makes 3 calls):
  
  Mobile App --> GET /api/users/42        --> User Service
  Mobile App --> GET /api/users/42/orders --> Order Service
  Mobile App --> GET /api/users/42/recommendations --> Recommendation Service
  
  3 round trips from mobile! Slow on cellular networks.
  
  
  With Aggregation (client makes 1 call):
  
  Mobile App --> GET /api/users/42/dashboard --> API Gateway
  
  API Gateway internally:
    +--> User Service: GET /users/42               (parallel)
    +--> Order Service: GET /users/42/orders       (parallel)
    +--> Recommendation Service: GET /users/42/recs (parallel)
    
  Gateway combines results:
  {
    "user": {"name": "Alice", ...},
    "recent_orders": [...],
    "recommendations": [...]
  }
  
  1 round trip from mobile. Much faster.

This is especially valuable for:

Mobile clients (high latency, limited bandwidth)
BFF (Backend for Frontend) pattern
Reducing over-fetching and under-fetching
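
The parallel fan-out can be sketched with `asyncio.gather`. The three `fetch_*` coroutines below are hypothetical stand-ins that simulate network calls with a short sleep; a real gateway would issue HTTP requests and would also need timeouts and a policy for partial failures.

```python
import asyncio


async def fetch_user(user_id: int) -> dict:
    await asyncio.sleep(0.05)  # simulated call to the User Service
    return {"name": "Alice"}


async def fetch_orders(user_id: int) -> list:
    await asyncio.sleep(0.05)  # simulated call to the Order Service
    return [{"id": 1001}]


async def fetch_recommendations(user_id: int) -> list:
    await asyncio.sleep(0.05)  # simulated call to the Recommendation Service
    return ["laptop stand"]


async def dashboard(user_id: int) -> dict:
    """Call all three services concurrently and merge the results."""
    user, orders, recs = await asyncio.gather(
        fetch_user(user_id),
        fetch_orders(user_id),
        fetch_recommendations(user_id),
    )
    return {"user": user, "recent_orders": orders, "recommendations": recs}


result = asyncio.run(dashboard(42))
print(result)
```

Because the calls run concurrently, the total wait is close to the slowest call rather than the sum of all three.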

6. Other Gateway Responsibilities

Responsibility	Description
SSL/TLS termination	Decrypt HTTPS at the gateway; internal traffic can be plain HTTP on a trusted private network, or re-encrypted where required
Logging and monitoring	Centralized request/response logging
Circuit breaking	Stop sending requests to failing services
Caching	Cache GET responses to reduce backend load
CORS handling	Manage cross-origin resource sharing headers
IP whitelisting/blacklisting	Block or allow specific IPs
Request validation	Validate request schema before forwarding
Compression	gzip/brotli responses
Retry logic	Retry failed idempotent requests (e.g., GET) with exponential backoff and jitter
Load shedding	Reject low-priority requests under high load
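
Circuit breaking is the least obvious row in the table, so here is a minimal sketch of one. The thresholds, class names, and the `outcome` helper are hypothetical, and time is passed in explicitly to keep the demo deterministic. After enough consecutive failures the breaker fails fast, then allows a single trial call once the timeout has passed.

```python
class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting the backend."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout_s: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, now: float):
        if self.opened_at is not None and now - self.opened_at < self.reset_timeout_s:
            raise CircuitOpenError("circuit open; failing fast")
        # Closed, or half-open (timeout elapsed): let the call through.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = now  # open, or re-open after a failed trial call
            raise
        self.failures = 0
        self.opened_at = None  # success closes the circuit
        return result


calls = {"n": 0}


def failing_backend():
    calls["n"] += 1
    raise ConnectionError("backend down")


def outcome(breaker: CircuitBreaker, func, now: float):
    try:
        return breaker.call(func, now)
    except CircuitOpenError:
        return "rejected"
    except ConnectionError:
        return "failed"


breaker = CircuitBreaker(failure_threshold=3, reset_timeout_s=30.0)
log = [outcome(breaker, failing_backend, now=t) for t in (0, 1, 2, 3)]
print(log)  # -> ['failed', 'failed', 'failed', 'rejected']
recovered = outcome(breaker, lambda: "ok", now=40)  # trial call after the timeout
```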

Popular API Gateways

Gateway	Type	Best For	Key Features
Kong	Open source / Enterprise	General purpose	Plugin ecosystem, Lua/Go plugins, DB-less mode
AWS API Gateway	Cloud-managed	AWS ecosystems	Lambda integration, WebSocket, REST & HTTP APIs
Nginx	Open source	High performance	Reverse proxy, extensive config, Lua scripting
Envoy	Open source	Service mesh	gRPC-native, observability, xDS API
Traefik	Open source	Container-native	Auto-discovery, Docker/K8s integration
Apigee (Google)	Cloud-managed	Enterprise API management	Analytics, developer portal, monetization
Azure API Management	Cloud-managed	Azure ecosystems	Policy engine, developer portal
Spring Cloud Gateway	Framework	Java/Spring ecosystems	Java-native, reactive, Spring integration
Zuul (Netflix)	Open source	JVM ecosystems	Filters, dynamic routing, Netflix battle-tested

API Gateway vs Load Balancer

This is a common interview question. They overlap but serve different purposes.

  API Gateway                         Load Balancer
  ─────────────────────────────────────────────────────
  Application-level concerns          Traffic distribution
  Auth, rate limiting, transformation  Health checks, routing
  Single entry point for clients      Distributes to server pool
  Understands API semantics           Protocol-level routing
  Often L7 only                       L4 or L7
  
  
  In practice, they work TOGETHER:
  
  Client --> API Gateway --> Load Balancer --> Service Instances
                  |              |
           Auth, routing    Distribution
           Rate limiting    Health checks
           Transformation   Failover

Feature	API Gateway	Load Balancer
Primary purpose	API management	Traffic distribution
Authentication	Yes	No
Rate limiting	Yes	No (typically)
Request transformation	Yes	No
API composition	Yes	No
Health checks	Sometimes	Yes (core feature)
SSL termination	Yes	Yes (L7)
Content routing	Yes (rich)	Yes (basic, L7)
Protocol translation	Yes	No
Caching	Sometimes	No

Key interview answer: "An API gateway manages API concerns (auth, rate limiting, transformation). A load balancer distributes traffic across server instances. In most architectures, you use both: the gateway handles cross-cutting concerns, then forwards to a load balancer that distributes across service instances."
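
To contrast with the gateway sketches, the load balancer's job can be reduced to picking the next healthy instance. The round-robin balancer below is a hypothetical illustration with made-up instance names, not the algorithm of any specific product.

```python
import itertools


class RoundRobinBalancer:
    """Cycles through instances, skipping any marked unhealthy."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._cycle = itertools.cycle(self.instances)

    def mark(self, instance: str, healthy: bool) -> None:
        """Record a health-check result for one instance."""
        if healthy:
            self.healthy.add(instance)
        else:
            self.healthy.discard(instance)

    def next_instance(self) -> str:
        # Look at each instance at most once per call.
        for _ in range(len(self.instances)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances")


lb = RoundRobinBalancer(["user-svc-a", "user-svc-b", "user-svc-c"])
first = [lb.next_instance() for _ in range(4)]
lb.mark("user-svc-b", healthy=False)  # a health check failed
after_failure = [lb.next_instance() for _ in range(3)]
print(first)          # -> ['user-svc-a', 'user-svc-b', 'user-svc-c', 'user-svc-a']
print(after_failure)  # -> ['user-svc-c', 'user-svc-a', 'user-svc-c']
```

Nothing here inspects tokens, paths, or payloads, which is the practical difference from the gateway's responsibilities.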

API Gateway in Microservices Architecture

  Microservices Architecture with API Gateway:
  
  +------------------------------------------------------------------+
  |  Clients                                                          |
  |  +--------+  +---------+  +----------+                           |
  |  | Mobile |  |   Web   |  | Partner  |                           |
  |  |  App   |  |   App   |  |   API    |                           |
  |  +---+----+  +----+----+  +----+-----+                           |
  |      |            |            |                                  |
  |      +------+-----+-----+-----+                                  |
  |             |                                                     |
  |             v                                                     |
  |      +------+------+                                              |
  |      | API Gateway |  Auth, Rate Limit, Route, Transform         |
  |      +------+------+                                              |
  |             |                                                     |
  |    +--------+--------+-----------+                                |
  |    |        |        |           |                                |
  |    v        v        v           v                                |
  |  +----+  +-----+  +-------+  +--------+                          |
  |  |User|  |Order|  |Product|  |Payment |                          |
  |  |Svc |  |Svc  |  |Svc    |  |Svc     |                          |
  |  +----+  +-----+  +-------+  +--------+                          |
  +------------------------------------------------------------------+

Backend for Frontend (BFF) Pattern

Different clients need different API shapes. Instead of one gateway for all, create a specialized gateway per client type.

  BFF Pattern:
  
  Mobile App --> Mobile BFF Gateway --> Services
  Web App    --> Web BFF Gateway    --> Services
  Partner    --> Partner API Gateway --> Services
  
  Each BFF:
  - Aggregates data optimized for its client
  - Returns only fields the client needs
  - Handles client-specific auth (e.g., API keys for partners)
  
  
  Mobile BFF:
  GET /dashboard --> Aggregates user + orders + recommendations
                     Returns compact JSON (mobile bandwidth)
  
  Web BFF:
  GET /dashboard --> Aggregates user + orders + recommendations + analytics
                     Returns full JSON (desktop bandwidth)
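
The difference between the two BFFs can be sketched as two shaping functions over the same aggregated data. The data, field names, and trimming limits below are hypothetical and only illustrate that each BFF returns the shape its client needs.

```python
# Hypothetical aggregated data, as if already fetched from backend services.
AGGREGATED = {
    "user": {"id": 42, "name": "Alice", "email": "alice@example.com"},
    "orders": [{"id": n, "total": 10 * n, "items": ["..."]} for n in range(1, 6)],
    "recommendations": ["laptop stand", "usb hub", "monitor", "keyboard"],
    "analytics": {"visits_30d": 12},
}


def mobile_dashboard(data: dict) -> dict:
    """Compact payload: fewer items and only the fields the mobile UI shows."""
    return {
        "name": data["user"]["name"],
        "recent_orders": [{"id": o["id"], "total": o["total"]} for o in data["orders"][:3]],
        "recommendations": data["recommendations"][:2],
    }


def web_dashboard(data: dict) -> dict:
    """Fuller payload: complete lists plus analytics for the desktop UI."""
    return {
        "user": data["user"],
        "recent_orders": data["orders"],
        "recommendations": data["recommendations"],
        "analytics": data["analytics"],
    }


mobile = mobile_dashboard(AGGREGATED)
web = web_dashboard(AGGREGATED)
print(len(mobile["recent_orders"]), len(web["recent_orders"]))  # -> 3 5
```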

Gateway Design Considerations

1. Single Point of Failure

The gateway is on the critical path. If it goes down, every API behind it becomes unreachable, even if the backend services are healthy.

Mitigations:

Deploy multiple gateway instances behind a load balancer
Use cloud-managed gateways (AWS API Gateway auto-scales)
Implement circuit breakers and graceful degradation

2. Latency Overhead

Every request passes through the gateway, adding latency.

  Without gateway: Client --> Service (10ms)
  With gateway:    Client --> Gateway (2ms) --> Service (10ms) = 12ms
  
  Overhead is typically 1-5ms for simple routing, and more if the gateway
  calls an auth service or transforms payloads. Acceptable for most use cases.

Mitigations:

Keep gateway logic lightweight
Avoid heavy transformations in the gateway
Cache frequent responses at the gateway
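
Caching frequent responses can be as simple as a dictionary with expiry times. The sketch below is a hypothetical in-process TTL cache keyed by method and path; it ignores cache-control headers, per-user variation, and eviction, all of which a real gateway cache has to handle.

```python
class TTLCache:
    """Caches values for a fixed time-to-live; expired entries are refetched."""

    def __init__(self, ttl_s: float):
        self.ttl_s = ttl_s
        self._store = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key: str, fetch, now: float):
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the backend is not called
        value = fetch()      # cache miss or expired: call the backend
        self._store[key] = (now + self.ttl_s, value)
        return value


backend_calls = []


def product_backend():
    backend_calls.append(1)
    return {"id": 42, "name": "laptop"}


cache = TTLCache(ttl_s=5)
cache.get_or_fetch("GET /api/products/42", product_backend, now=0)  # miss
cache.get_or_fetch("GET /api/products/42", product_backend, now=3)  # hit
hits_so_far = len(backend_calls)
cache.get_or_fetch("GET /api/products/42", product_backend, now=6)  # expired, refetch
print(hits_so_far, len(backend_calls))  # -> 1 2
```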

3. Gateway Bloat

Over time, too much logic migrates to the gateway.

Warning signs:

Business logic in the gateway (should be in services)
Complex orchestration (should be a dedicated service)
Gateway config file is thousands of lines

Rule of thumb: The gateway handles cross-cutting infrastructure concerns. Business logic belongs in services.

4. Configuration Management

  Gateway configuration approaches:
  
  1. Static config (Nginx, HAProxy):
     - Config file checked into VCS
     - Reload on deployment
     
  2. Dynamic config (Kong, Envoy):
     - Config stored in database or control plane
     - Changes take effect without restart
     
  3. Code-based (Spring Cloud Gateway):
     - Routes defined in application code
     - Full programming language flexibility

Gateway Security Best Practices

Practice	Implementation
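Terminate TLS at the gateway	Internal traffic can be plain HTTP only on an isolated private network (e.g., inside a VPC); use TLS or mTLS internally where zero-trust or compliance requires encryption in transit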
Validate all input at gateway	Schema validation, size limits, content type checks
Rate limit aggressively	Per-user, per-IP, per-endpoint
Never expose internal URLs	Gateway rewrites paths; clients see only public URLs
Log all requests	Request ID, user ID, status, latency, response size
Implement CORS properly	Whitelist origins, methods, headers at gateway
Use circuit breakers	Prevent cascading failures when services are down
Sanitize responses	Strip internal headers, stack traces, debug info
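
The "validate all input" row can be illustrated with a small pre-forwarding check. The size limit, the required fields, and the function name below are assumptions made for the order example used earlier; a real gateway would more likely validate against a declared schema.

```python
import json

MAX_BODY_BYTES = 1_000_000  # assumed size limit
REQUIRED_FIELDS = {"item": str, "qty": int}  # assumed schema for POST /api/orders


def validate_order_request(content_type: str, raw_body: bytes):
    """Return (HTTP status, message); only a 200 should be forwarded."""
    if content_type.split(";")[0].strip().lower() != "application/json":
        return 415, "unsupported media type"
    if len(raw_body) > MAX_BODY_BYTES:
        return 413, "payload too large"
    try:
        body = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and invalid UTF-8
        return 400, "malformed JSON"
    if not isinstance(body, dict):
        return 400, "body must be a JSON object"
    for field, expected in REQUIRED_FIELDS.items():
        if not isinstance(body.get(field), expected):
            return 400, f"field '{field}' must be {expected.__name__}"
    return 200, "ok"


print(validate_order_request("application/json", b'{"item": "laptop", "qty": 1}'))
# -> (200, 'ok')
print(validate_order_request("application/json", b'{"item": "laptop"}'))
# -> (400, "field 'qty' must be int")
```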

Key Takeaways

API Gateway = single entry point for all client-to-backend communication
The gateway handles cross-cutting concerns: auth, rate limiting, routing, transformation
API composition at the gateway reduces round trips for mobile clients
Gateway is NOT a load balancer -- they complement each other
BFF pattern creates client-specific gateways for different frontends
Beware gateway bloat -- keep business logic in services, not the gateway
Gateway must be highly available -- without redundancy it is a single point of failure
In interviews, mention the gateway as the front door of your microservices architecture

Explain-It Challenge

"You are designing an API for a ride-sharing app. You have separate services for riders, drivers, trips, payments, and notifications. The mobile app needs a single 'request ride' flow that touches all five services. Your partner API (for corporate accounts) has stricter rate limits and different authentication. Design the API gateway layer, including routing, authentication, rate limiting, and how the 'request ride' call is orchestrated."