9.9.d API Gateway

What API Gateways Do

An API Gateway is a single entry point for all client requests to a backend system. It acts as a reverse proxy that routes requests to appropriate services while handling cross-cutting concerns like authentication, rate limiting, and request transformation.

  WITHOUT API Gateway:
  
  Mobile App --> User Service     (auth, rate limiting in each service)
  Mobile App --> Order Service    (auth, rate limiting in each service)
  Mobile App --> Payment Service  (auth, rate limiting in each service)
  Web App    --> User Service     (duplicated logic everywhere)
  Web App    --> Order Service
  
  Problems: Client must know all service URLs, auth duplicated, no central control
  
  
  WITH API Gateway:
  
  Mobile App ---+
                |
  Web App ------+--> API Gateway --> User Service
                |         |      --> Order Service
  Partner API --+         |      --> Payment Service
                          |
                    Handles: Auth, Rate Limiting, Routing,
                    Logging, Transformation, Caching

Core Responsibilities

1. Request Routing

The gateway maps external URLs to internal service endpoints.

  Routing Table:
  
  External URL                    Internal Service
  ─────────────────────────────────────────────────
  GET  /api/users/*            -> user-service:8080/users/*
  POST /api/orders/*           -> order-service:8081/orders/*
  GET  /api/products/*         -> product-service:8082/products/*
  POST /api/payments/*         -> payment-service:8083/payments/*
  
  
  Example:
  Client: GET https://api.example.com/api/users/42
  Gateway routes to: http://user-service:8080/users/42
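The routing table above can be sketched as a simple prefix lookup. This is a minimal illustration under stated assumptions, not how any particular gateway implements routing: `ROUTES` and `resolve` are hypothetical names, and the upstream URLs simply mirror the table.

```python
from typing import Optional

# Hypothetical routing table mirroring the one above:
# (HTTP method, external path prefix, internal base URL).
ROUTES = [
    ("GET", "/api/users/", "http://user-service:8080/users/"),
    ("POST", "/api/orders/", "http://order-service:8081/orders/"),
    ("GET", "/api/products/", "http://product-service:8082/products/"),
    ("POST", "/api/payments/", "http://payment-service:8083/payments/"),
]


def resolve(method: str, path: str) -> Optional[str]:
    """Return the internal URL for a request, or None if no route matches."""
    for route_method, prefix, upstream in ROUTES:
        if method == route_method and path.startswith(prefix):
            # Keep whatever follows the matched prefix (the "*" part).
            return upstream + path[len(prefix):]
    return None


print(resolve("GET", "/api/users/42"))
```

A request with no matching route returns `None`, which a gateway would typically turn into a 404 response.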

Path-based routing:

  /api/v1/* --> Service Pool A (version 1)
  /api/v2/* --> Service Pool B (version 2)

Header-based routing:

  X-Client-Type: mobile  --> Mobile-optimized service
  X-Client-Type: web     --> Web service

Weight-based routing (canary deployments):

  /api/users/* --> Service v1 (90% of traffic)
                --> Service v2 (10% of traffic)
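A weighted split like the 90/10 canary above could be sketched as a weighted random choice. The names `CANARY` and `pick_upstream` are hypothetical. Real gateways often add stickiness (for example, hashing a user ID) so one client does not bounce between versions; this sketch leaves that out.

```python
import random
from typing import Optional, Sequence, Tuple

# Assumed weights matching the 90/10 split shown above.
CANARY = [("service-v1", 90), ("service-v2", 10)]


def pick_upstream(weighted: Sequence[Tuple[str, int]],
                  rng: Optional[random.Random] = None) -> str:
    """Pick one upstream with probability proportional to its weight."""
    chooser = rng if rng is not None else random
    names = [name for name, _ in weighted]
    weights = [weight for _, weight in weighted]
    return chooser.choices(names, weights=weights, k=1)[0]


# A seeded generator keeps the demo reproducible.
rng = random.Random(0)
picks = [pick_upstream(CANARY, rng) for _ in range(10_000)]
canary_share = picks.count("service-v2") / len(picks)
```

Over many requests `canary_share` should settle near 0.10; any single request can go either way.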

2. Authentication and Authorization

The gateway validates identity and permissions before requests reach backend services.

  Authentication Flow:
  
  Client                  API Gateway              Auth Service        Backend
    |                         |                         |                 |
    |-- Request + JWT ------->|                         |                 |
    |                         |-- Validate JWT -------->|                 |
    |                         |<-- Valid, user=42 ------|                 |
    |                         |                         |                 |
    |                         |-- Forward request ----->|---------------->|
    |                         |   + X-User-Id: 42      |                 |
    |                         |   + X-Roles: admin     |                 |
    |<-- Response ------------|<------------------------|<----------------|
    
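  Backend services trust the gateway's headers and skip JWT validation.
  This is safe only if the gateway strips any client-supplied X-User-Id and
  X-Roles headers and the backends are unreachable except through the gateway.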

Common auth patterns at the gateway:

  Pattern                     How It Works                                        Use Case
  ───────────────────────────────────────────────────────────────────────────────────────────────────────
  JWT validation              Gateway verifies JWT signature and claims           Stateless auth
  OAuth2 token introspection  Gateway calls auth server to validate opaque token  Third-party tokens
  API key validation          Gateway checks API key against a registry           Partner/developer APIs
  mTLS                        Gateway verifies client certificate                 Service-to-service, B2B
  Basic Auth                  Gateway checks username/password                    Simple internal APIs
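To make the JWT validation row concrete, here is a minimal sketch of verifying an HS256-signed token with only the standard library. It assumes a shared secret between the auth service and the gateway and checks only the signature, the algorithm, and `exp`. The function names are hypothetical, and in practice a maintained library such as PyJWT would be the usual choice.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Demo helper: mint a token the way an auth service might."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature and expiry are valid, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if not isinstance(header, dict) or header.get("alg") != "HS256":
            return None  # reject "none" and unexpected algorithms
        expected = hmac.new(
            secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
        ).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:  # wrong segment count, bad base64, bad JSON
        return None
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if not isinstance(exp, (int, float)) or current >= exp:
        return None
    return claims


def identity_headers(claims: dict) -> dict:
    """Headers the gateway forwards to backends, as in the flow above."""
    return {"X-User-Id": str(claims["sub"]), "X-Roles": ",".join(claims.get("roles", []))}
```

The `now` parameter exists only so the expiry check can be exercised deterministically; a gateway would rely on the current time.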

3. Rate Limiting

The gateway controls how many requests a client can make in a given time window.

  Rate Limiting Example:
  
  Plan: Free tier = 100 requests/minute
  
  Request  1:  200 OK  (remaining: 99)
  Request  2:  200 OK  (remaining: 98)
  ...
  Request 100: 200 OK  (remaining: 0)
  Request 101: 429 Too Many Requests
               Retry-After: 30
  
  Response headers:
  X-RateLimit-Limit: 100
  X-RateLimit-Remaining: 0
  X-RateLimit-Reset: 1680000060

Rate limiting algorithms:

  Algorithm               Description                                              Pros           Cons
  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  Fixed Window            Count requests in fixed time windows (e.g., per minute)  Simple         Burst at window boundary
  Sliding Window Log      Track timestamps of all requests                         Accurate       Memory-intensive
  Sliding Window Counter  Hybrid of fixed window + weighted overlap                Good balance   Slightly approximate
  Token Bucket            Tokens added at fixed rate; request consumes a token     Allows bursts  More complex
  Leaky Bucket            Requests processed at fixed rate; excess queued/dropped  Smooth output  No burst handling

  Token Bucket:
  
  Bucket capacity: 10 tokens
  Refill rate: 2 tokens/second
  
  t=0s:  [##########]  10 tokens (full)
  t=0s:  3 requests --> [#######]  7 tokens
  t=1s:  refill +2  --> [#########]  9 tokens
  t=1s:  5 requests --> [####]  4 tokens
  t=2s:  refill +2  --> [######]  6 tokens
  
  If bucket empty: reject request (429)
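The token bucket walkthrough above can be reproduced with a few lines of code. This is a single-process sketch with an injected clock so the numbers match the diagram. The class name and its parameters are assumptions, and a real gateway would typically keep this state in a shared store.

```python
class TokenBucket:
    """Token bucket with lazy refill: tokens are topped up on each check."""

    def __init__(self, capacity: float, refill_rate: float, now: float = 0.0):
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # start full
        self.last_refill = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Return True and consume tokens if the request fits, else False (429)."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = max(self.last_refill, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(capacity=10, refill_rate=2)
first = [bucket.allow(now=0.0) for _ in range(3)]    # 10 -> 7 tokens
second = [bucket.allow(now=1.0) for _ in range(5)]   # refill to 9, then 9 -> 4
third = [bucket.allow(now=2.0) for _ in range(7)]    # refill to 6, seventh is rejected
```

Refilling lazily on each call avoids a background timer; the bucket only needs the time of the last check.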

Rate limiting dimensions:

  • Per user/API key
  • Per IP address
  • Per endpoint
  • Per service/tenant
  • Global (total system throughput)
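The dimensions above usually become part of the limiter's key. Below is a sketch of the sliding window counter from the algorithm table, keyed by an arbitrary string such as `user:42` or `ip:203.0.113.5`. The key format and class name are assumptions, and the state lives in process memory purely for illustration.

```python
from collections import defaultdict
from typing import Dict


class SlidingWindowCounter:
    """Approximates a sliding window by weighting the previous fixed window."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window = window_seconds
        # key -> {window index -> request count}
        self.counts: Dict[str, Dict[int, int]] = defaultdict(dict)

    def allow(self, key: str, now: float) -> bool:
        index = int(now // self.window)
        fraction = (now % self.window) / self.window  # how far into this window
        buckets = self.counts[key]
        current = buckets.get(index, 0)
        previous = buckets.get(index - 1, 0)
        # The previous window counts less the further we move into the current one.
        estimate = previous * (1.0 - fraction) + current
        if estimate >= self.limit:
            return False
        # Keep only the two windows that can still matter.
        self.counts[key] = {index - 1: previous, index: current + 1}
        return True


limiter = SlidingWindowCounter(limit=100, window_seconds=60)
allowed_in_first_minute = sum(limiter.allow("user:42", now=0.0) for _ in range(101))
```

Each key is limited independently, so a per-user limit and a per-IP limit can coexist by checking two keys for the same request.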

4. Request/Response Transformation

The gateway can modify requests before they reach services and modify responses before they reach clients.

  Request Transformation:
  
  Client sends:
  POST /api/orders
  Authorization: Bearer eyJhbG...
  Content-Type: application/json
  {"item": "laptop", "qty": 1}
  
  Gateway transforms to:
  POST /internal/orders/create
  X-User-Id: 42
  X-Request-Id: uuid-abc-123
  X-Forwarded-For: 203.0.113.5
  Content-Type: application/json
  {"item": "laptop", "qty": 1, "user_id": 42, "timestamp": "2026-04-11T..."}
  
  
  Response Transformation:
  
  Service returns:
  {"user_id": 42, "internal_score": 85, "name": "Alice", ...}
  
  Gateway transforms to (remove internal fields):
  {"name": "Alice", ...}

Common transformations:

  Transformation        Example
  ────────────────────────────────────────────────────────────
  Header injection      Add X-Request-Id, X-Forwarded-For
  Header removal        Strip internal headers from responses
  Protocol translation  REST to gRPC, HTTP to WebSocket
  Payload modification  Add fields, remove sensitive data
  Response filtering    Return only requested fields
  Format conversion     XML to JSON, JSON to Protocol Buffers
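The request and response transformations shown earlier in this section can be sketched as two pure functions. The header and field names follow the example. The stripped-header set and `INTERNAL_FIELDS` are assumptions for illustration, not a complete policy.

```python
import uuid
from datetime import datetime, timezone
from typing import Dict, Tuple

# Assumption: headers the gateway owns; client-supplied copies are discarded.
GATEWAY_OWNED_HEADERS = {"authorization", "x-user-id", "x-roles",
                         "x-request-id", "x-forwarded-for"}
# Assumption: fields that must never leave the backend.
INTERNAL_FIELDS = {"user_id", "internal_score"}


def transform_request(headers: Dict[str, str], body: dict, user_id: int,
                      client_ip: str) -> Tuple[Dict[str, str], dict]:
    """Replace the client credential with verified identity and tracing headers."""
    out_headers = {k: v for k, v in headers.items()
                   if k.lower() not in GATEWAY_OWNED_HEADERS}
    out_headers["X-User-Id"] = str(user_id)
    out_headers["X-Request-Id"] = str(uuid.uuid4())
    out_headers["X-Forwarded-For"] = client_ip
    out_body = dict(body, user_id=user_id,
                    timestamp=datetime.now(timezone.utc).isoformat())
    return out_headers, out_body


def transform_response(body: dict) -> dict:
    """Strip internal fields before the response reaches the client."""
    return {k: v for k, v in body.items() if k not in INTERNAL_FIELDS}


print(transform_response({"user_id": 42, "internal_score": 85, "name": "Alice"}))
```

Dropping client-supplied copies of gateway-owned headers matters because backends trust those headers.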

5. Request Aggregation (API Composition)

The gateway can combine responses from multiple services into a single response.

  Without Aggregation (client makes 3 calls):
  
  Mobile App --> GET /api/users/42        --> User Service
  Mobile App --> GET /api/users/42/orders --> Order Service
  Mobile App --> GET /api/users/42/recommendations --> Recommendation Service
  
  3 round trips from mobile! Slow on cellular networks.
  
  
  With Aggregation (client makes 1 call):
  
  Mobile App --> GET /api/users/42/dashboard --> API Gateway
  
  API Gateway internally:
    +--> User Service: GET /users/42
    +--> Order Service: GET /users/42/orders     (parallel)
    +--> Recommendation Service: GET /users/42/recs (parallel)
    
  Gateway combines results:
  {
    "user": {"name": "Alice", ...},
    "recent_orders": [...],
    "recommendations": [...]
  }
  
  1 round trip from mobile. Much faster.

This is especially valuable for:

  • Mobile clients (high latency, limited bandwidth)
  • BFF (Backend for Frontend) pattern
  • Reducing over-fetching and under-fetching
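The dashboard aggregation described above can be sketched with `asyncio.gather`, which runs the three backend calls concurrently. The `fetch_*` coroutines are stand-ins for real HTTP calls. Treating the user lookup as required and the other two as optional is an assumption of this sketch.

```python
import asyncio


# Stand-ins for real HTTP calls to the backend services (hypothetical data).
async def fetch_user(user_id: int) -> dict:
    await asyncio.sleep(0.01)
    return {"id": user_id, "name": "Alice"}


async def fetch_orders(user_id: int) -> list:
    await asyncio.sleep(0.01)
    return [{"order_id": 1}]


async def fetch_recommendations(user_id: int) -> list:
    await asyncio.sleep(0.01)
    return ["laptop stand"]


async def dashboard(user_id: int) -> dict:
    """Fan out to three services in parallel and merge the results."""
    user, orders, recs = await asyncio.gather(
        fetch_user(user_id),
        fetch_orders(user_id),
        fetch_recommendations(user_id),
        return_exceptions=True,  # collect failures instead of raising on the first
    )
    if isinstance(user, BaseException):
        raise user  # no dashboard without the user record
    return {
        "user": user,
        # Optional sections degrade to empty lists if their service failed.
        "recent_orders": [] if isinstance(orders, BaseException) else orders,
        "recommendations": [] if isinstance(recs, BaseException) else recs,
    }


result = asyncio.run(dashboard(42))
```

Because the calls overlap, total latency is roughly that of the slowest backend rather than the sum of all three.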

6. Other Gateway Responsibilities

  Responsibility                Description
  ─────────────────────────────────────────────────────────────────────────────────────────
  SSL/TLS termination           Decrypt HTTPS at the gateway; internal traffic can be HTTP
  Logging and monitoring        Centralized request/response logging
  Circuit breaking              Stop sending requests to failing services
  Caching                       Cache GET responses to reduce backend load
  CORS handling                 Manage cross-origin resource sharing headers
  IP whitelisting/blacklisting  Block or allow specific IPs
  Request validation            Validate request schema before forwarding
  Compression                   gzip/brotli responses
  Retry logic                   Retry failed requests with exponential backoff
  Load shedding                 Reject low-priority requests under high load
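Circuit breaking from the table above can be sketched as a small state machine. The threshold and timeout values are assumptions. This simplified version lets every request through once the timeout has passed, whereas production breakers usually admit only a limited number of trial requests.

```python
from typing import Optional


class CircuitBreaker:
    """Opens after consecutive failures, then allows trial requests after a timeout."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def allow_request(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        # Half-open: once the timeout has passed, let requests probe the service.
        return now - self.opened_at >= self.reset_timeout

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # (re)open and restart the timeout


breaker = CircuitBreaker(failure_threshold=3, reset_timeout=30.0)
for _ in range(3):
    breaker.record_failure(now=0.0)
blocked = not breaker.allow_request(now=10.0)   # still inside the timeout
probing = breaker.allow_request(now=31.0)       # timeout elapsed, trial allowed
```

While the circuit is open the gateway can fail fast, which keeps a struggling service from being flooded with retries.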

Popular API Gateways

  Gateway               Type                      Best For                   Key Features
  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  Kong                  Open source / Enterprise  General purpose            Plugin ecosystem, Lua/Go plugins, DB-less mode
  AWS API Gateway       Cloud-managed             AWS ecosystems             Lambda integration, WebSocket, REST & HTTP APIs
  Nginx                 Open source               High performance           Reverse proxy, extensive config, Lua scripting
  Envoy                 Open source               Service mesh               gRPC-native, observability, xDS API
  Traefik               Open source               Container-native           Auto-discovery, Docker/K8s integration
  Apigee (Google)       Cloud-managed             Enterprise API management  Analytics, developer portal, monetization
  Azure API Management  Cloud-managed             Azure ecosystems           Policy engine, developer portal
  Spring Cloud Gateway  Framework                 Java/Spring ecosystems     Java-native, reactive, Spring integration
  Zuul (Netflix)        Open source               JVM ecosystems             Filters, dynamic routing, Netflix battle-tested

API Gateway vs Load Balancer

This is a common interview question. They overlap but serve different purposes.

  API Gateway                         Load Balancer
  ─────────────────────────────────────────────────────
  Application-level concerns          Traffic distribution
  Auth, rate limiting, transformation  Health checks, routing
  Single entry point for clients      Distributes to server pool
  Understands API semantics           Protocol-level routing
  Often L7 only                       L4 or L7
  
  
  In practice, they work TOGETHER:
  
  Client --> API Gateway --> Load Balancer --> Service Instances
                  |              |
           Auth, routing    Distribution
           Rate limiting    Health checks
           Transformation   Failover

  Feature                 API Gateway     Load Balancer
  ─────────────────────────────────────────────────────────
  Primary purpose         API management  Traffic distribution
  Authentication          Yes             No
  Rate limiting           Yes             No (typically)
  Request transformation  Yes             No
  API composition         Yes             No
  Health checks           Sometimes       Yes (core feature)
  SSL termination         Yes             Yes (L7)
  Content routing         Yes (rich)      Yes (basic, L7)
  Protocol translation    Yes             No
  Caching                 Sometimes       No
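For contrast with the gateway's responsibilities, the load balancer's core job (distribution plus health-aware failover) fits in a short sketch. The class and instance names are hypothetical, and real load balancers discover health through active probes rather than a manual `mark` call.

```python
from typing import List, Optional


class RoundRobinBalancer:
    """Round-robin over instances, skipping any marked unhealthy."""

    def __init__(self, instances: List[str]):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._next = 0

    def mark(self, instance: str, healthy: bool) -> None:
        """Record the outcome of a health check for one instance."""
        if healthy:
            self.healthy.add(instance)
        else:
            self.healthy.discard(instance)

    def pick(self) -> Optional[str]:
        """Return the next healthy instance, or None if none are healthy."""
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next % len(self.instances)]
            self._next += 1
            if candidate in self.healthy:
                return candidate
        return None


lb = RoundRobinBalancer(["svc-a", "svc-b", "svc-c"])
before = [lb.pick() for _ in range(4)]
lb.mark("svc-b", healthy=False)   # failed health check
after = [lb.pick() for _ in range(2)]
```

Nothing here inspects tokens, paths, or payloads, which is the distinction the comparison above draws.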

Key interview answer: "An API gateway manages API concerns (auth, rate limiting, transformation). A load balancer distributes traffic across server instances. In most architectures, you use both: the gateway handles cross-cutting concerns, then forwards to a load balancer that distributes across service instances."


API Gateway in Microservices Architecture

  Microservices Architecture with API Gateway:
  
  +------------------------------------------------------------------+
  |  Clients                                                          |
  |  +--------+  +---------+  +----------+                           |
  |  | Mobile |  |   Web   |  | Partner  |                           |
  |  |  App   |  |   App   |  |   API    |                           |
  |  +---+----+  +----+----+  +----+-----+                           |
  |      |            |            |                                  |
  |      +------+-----+-----+-----+                                  |
  |             |                                                     |
  |             v                                                     |
  |      +------+------+                                              |
  |      | API Gateway |  Auth, Rate Limit, Route, Transform         |
  |      +------+------+                                              |
  |             |                                                     |
  |    +--------+--------+-----------+                                |
  |    |        |        |           |                                |
  |    v        v        v           v                                |
  |  +----+  +-----+  +-------+  +--------+                          |
  |  |User|  |Order|  |Product|  |Payment |                          |
  |  |Svc |  |Svc  |  |Svc    |  |Svc     |                          |
  |  +----+  +-----+  +-------+  +--------+                          |
  +------------------------------------------------------------------+

Backend for Frontend (BFF) Pattern

Different clients need different API shapes. Instead of one gateway for all, create a specialized gateway per client type.

  BFF Pattern:
  
  Mobile App --> Mobile BFF Gateway --> Services
  Web App    --> Web BFF Gateway    --> Services
  Partner    --> Partner API Gateway --> Services
  
  Each BFF:
  - Aggregates data optimized for its client
  - Returns only fields the client needs
  - Handles client-specific auth (e.g., API keys for partners)
  
  
  Mobile BFF:
  GET /dashboard --> Aggregates user + orders + recommendations
                     Returns compact JSON (mobile bandwidth)
  
  Web BFF:
  GET /dashboard --> Aggregates user + orders + recommendations + analytics
                     Returns full JSON (desktop bandwidth)
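The difference between the two BFFs can be sketched as per-client response shaping over the same aggregated data. The field lists follow the example above. The list cap for mobile and the function name are assumptions made for illustration.

```python
from typing import Dict, Tuple

# Per-client field lists; "analytics" is web-only, as in the example above.
FIELDS_BY_CLIENT: Dict[str, Tuple[str, ...]] = {
    "mobile": ("user", "recent_orders", "recommendations"),
    "web": ("user", "recent_orders", "recommendations", "analytics"),
}
MOBILE_LIST_LIMIT = 3  # assumed cap to keep mobile payloads compact


def _trim(value):
    """Shorten lists for mobile; leave other values untouched."""
    return value[:MOBILE_LIST_LIMIT] if isinstance(value, list) else value


def shape_dashboard(full: dict, client: str) -> dict:
    """Return only the fields (and list lengths) this client type needs."""
    shaped = {key: full[key] for key in FIELDS_BY_CLIENT[client] if key in full}
    if client == "mobile":
        shaped = {key: _trim(value) for key, value in shaped.items()}
    return shaped


full = {
    "user": {"name": "Alice"},
    "recent_orders": [1, 2, 3, 4, 5],
    "recommendations": ["a", "b"],
    "analytics": {"visits": 10},
}
mobile_view = shape_dashboard(full, "mobile")
web_view = shape_dashboard(full, "web")
```

In a real BFF each client type would be a separate deployable; a single function with a `client` argument is only a compact way to show the contrast.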

Gateway Design Considerations

1. Single Point of Failure

The gateway is on the critical path. If it goes down, everything goes down.

Mitigations:

  • Deploy multiple gateway instances behind a load balancer
  • Use cloud-managed gateways (AWS API Gateway auto-scales)
  • Implement circuit breakers and graceful degradation

2. Latency Overhead

Every request passes through the gateway, adding latency.

  Without gateway: Client --> Service (10ms)
  With gateway:    Client --> Gateway (2ms) --> Service (10ms) = 12ms
  
  Overhead is typically 1-5ms. Acceptable for most use cases.

Mitigations:

  • Keep gateway logic lightweight
  • Avoid heavy transformations in the gateway
  • Cache frequent responses at the gateway

3. Gateway Bloat

Over time, too much logic migrates to the gateway.

Warning signs:

  • Business logic in the gateway (should be in services)
  • Complex orchestration (should be a dedicated service)
  • Gateway config file is thousands of lines

Rule of thumb: The gateway handles cross-cutting infrastructure concerns. Business logic belongs in services.

4. Configuration Management

  Gateway configuration approaches:
  
  1. Static config (Nginx, HAProxy):
     - Config file checked into VCS
     - Reload on deployment
     
  2. Dynamic config (Kong, Envoy):
     - Config stored in database or control plane
     - Changes take effect without restart
     
  3. Code-based (Spring Cloud Gateway):
     - Routes defined in application code
     - Full programming language flexibility

Gateway Security Best Practices

  Practice                       Implementation
  ──────────────────────────────────────────────────────────────────────────────────────────────
  Terminate TLS at the gateway   Internal HTTP is acceptable only on an isolated private network;
                                 re-encrypt or use mTLS where in-transit encryption is required
  Validate all input at gateway  Schema validation, size limits, content type checks
  Rate limit aggressively        Per-user, per-IP, per-endpoint
  Never expose internal URLs     Gateway rewrites paths; clients see only public URLs
  Log all requests               Request ID, user ID, status, latency, response size
  Implement CORS properly        Whitelist origins, methods, headers at gateway
  Use circuit breakers           Prevent cascading failures when services are down
  Sanitize responses             Strip internal headers, stack traces, debug info
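The input validation practice above (schema, size limits, content type) might look like the following sketch for the order request used earlier. The size limit and the `REQUIRED_FIELDS` schema are assumptions. A real gateway would more likely validate against a JSON Schema or OpenAPI definition.

```python
import json
from typing import List

MAX_BODY_BYTES = 64 * 1024                   # assumed size limit
REQUIRED_FIELDS = {"item": str, "qty": int}  # assumed schema for POST /api/orders


def validate_order_request(content_type: str, raw_body: bytes) -> List[str]:
    """Return a list of validation errors; an empty list means forward the request."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "application/json":
        return ["unsupported content type"]
    if len(raw_body) > MAX_BODY_BYTES:
        return ["body too large"]
    try:
        body = json.loads(raw_body)
    except ValueError:  # malformed JSON or invalid UTF-8
        return ["malformed JSON"]
    if not isinstance(body, dict):
        return ["body must be a JSON object"]
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        value = body.get(field)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
    return errors


print(validate_order_request("application/json", b'{"item": "laptop", "qty": 1}'))
```

Rejecting bad requests at the edge keeps malformed input from consuming backend capacity, though services should still validate anything that affects business rules.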

Key Takeaways

  1. API Gateway = single entry point for all client-to-backend communication
  2. The gateway handles cross-cutting concerns: auth, rate limiting, routing, transformation
  3. API composition at the gateway reduces round trips for mobile clients
  4. Gateway is NOT a load balancer -- they complement each other
  5. BFF pattern creates client-specific gateways for different frontends
  6. Beware gateway bloat -- keep business logic in services, not the gateway
  7. Gateway must be highly available -- otherwise it becomes a single point of failure
  8. In interviews, mention the gateway as the front door of your microservices architecture

Explain-It Challenge

"You are designing an API for a ride-sharing app. You have separate services for riders, drivers, trips, payments, and notifications. The mobile app needs a single 'request ride' flow that touches all five services. Your partner API (for corporate accounts) has stricter rate limits and different authentication. Design the API gateway layer, including routing, authentication, rate limiting, and how the 'request ride' call is orchestrated."