Episode 6 — Scaling, Reliability, Microservices, Web3 / 6.1 — Microservice Foundations

6.1.e -- Service Communication Patterns

How services talk to each other determines the reliability, performance, and coupling of your entire system. Choose the wrong communication pattern and you build a distributed monolith. Choose wisely and you unlock true independence.


Navigation << 6.1.d Database per Service | 6.1.e Service Communication Patterns | Exercise Questions >>


1. The Two Fundamental Styles

Synchronous (Request-Response):
  Client --> [Service A] --request--> [Service B]
                         <--response--
  Client waits. Service A waits. Everyone waits.

Asynchronous (Event-Driven):
  Client --> [Service A] --event--> [Message Broker] --event--> [Service B]
  Client gets immediate response.
  Service B processes when ready.

Dimension       | Synchronous                      | Asynchronous
----------------+----------------------------------+---------------------------------------------
Client waits?   | Yes                              | No (fire and forget)
Coupling        | Temporal (both must be running)  | Decoupled (producer does not know consumer)
Latency         | Sum of all hops                  | Immediate response + eventual processing
Error handling  | Immediate (HTTP status codes)    | Deferred (dead-letter queues, retries)
Complexity      | Lower (familiar HTTP)            | Higher (message brokers, idempotency)
Throughput      | Limited by slowest service       | High (buffer in broker)
Best for        | Queries, real-time needs         | Commands, background processing, events

2. Synchronous Communication

2.1 REST (HTTP/JSON)

The most common synchronous pattern. Services expose HTTP endpoints and clients call them with standard HTTP verbs.

// user-service/index.js -- REST API
const express = require('express');
const db = require('./db'); // pg Pool (assumed to be exported by ./db)
const app = express();
app.use(express.json());

app.get('/users/:id', async (req, res) => {
  const user = await db.query('SELECT * FROM users WHERE id = $1', [req.params.id]);
  if (!user.rows[0]) return res.status(404).json({ error: 'User not found' });
  res.json(user.rows[0]);
});

app.post('/users', async (req, res) => {
  const { name, email } = req.body;
  const result = await db.query(
    'INSERT INTO users (name, email) VALUES ($1, $2) RETURNING *',
    [name, email]
  );
  res.status(201).json(result.rows[0]);
});

app.listen(3001);
// order-service/clients/userClient.js -- calling User Service via REST
const axios = require('axios');

const USER_SERVICE_URL = process.env.USER_SERVICE_URL || 'http://localhost:3001';

class UserClient {
  async getUser(userId) {
    try {
      const response = await axios.get(`${USER_SERVICE_URL}/users/${userId}`, {
        timeout: 3000,  // 3-second timeout -- never wait forever
      });
      return response.data;
    } catch (err) {
      if (err.response?.status === 404) {
        return null; // User not found -- handle gracefully
      }
      throw new Error(`User Service unavailable: ${err.message}`);
    }
  }

  async getUsersBatch(userIds) {
    // Batch call to avoid N+1 problem
    const response = await axios.post(`${USER_SERVICE_URL}/users/batch`, {
      ids: userIds,
    }, { timeout: 5000 });
    return response.data;
  }
}

module.exports = new UserClient();

2.2 REST Best Practices Between Services

// 1. ALWAYS set timeouts
const response = await axios.get(url, { timeout: 3000 });

// 2. Use retry with exponential backoff
const axiosRetry = require('axios-retry');

const client = axios.create({ timeout: 3000 });
axiosRetry(client, {
  retries: 3,
  retryDelay: (retryCount) => {
    // retryCount starts at 1 for the first retry
    return 2 ** (retryCount - 1) * 1000; // 1s, 2s, 4s
  },
  retryCondition: (error) => {
    // Only retry on network errors or 5xx responses.
    // (Retrying non-idempotent requests on 5xx is only safe with idempotency keys.)
    return axiosRetry.isNetworkOrIdempotentRequestError(error)
      || error.response?.status >= 500;
  },
});

// 3. Use circuit breaker to avoid cascading failures
const CircuitBreaker = require('opossum');

function getUserCircuit(userId) {
  return axios.get(`${USER_SERVICE_URL}/users/${userId}`, { timeout: 3000 });
}

const breaker = new CircuitBreaker(getUserCircuit, {
  timeout: 5000,       // If function takes longer than 5s, trip
  errorThresholdPercentage: 50,  // Trip if 50% of requests fail
  resetTimeout: 10000, // Try again after 10s
});

breaker.fallback((userId) => {
  // Return cached data or a default
  return { id: userId, name: 'Unknown', cached: true };
});

// Usage
const user = await breaker.fire(userId);
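
opossum manages the breaker's state machine for you. To make that state machine concrete, here is a minimal hand-rolled sketch (class name, thresholds, and error messages are illustrative, not a real library API):

```javascript
// Minimal circuit breaker sketch: closed -> open after N failures,
// half-open after a cooldown, closed again on the next success.
class MiniBreaker {
  constructor(fn, { failureThreshold = 3, resetTimeoutMs = 10000 } = {}) {
    this.fn = fn;
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.failures = 0;
    this.state = 'closed'; // 'closed' | 'open' | 'half-open'
    this.openedAt = 0;
  }

  async fire(...args) {
    if (this.state === 'open') {
      if (Date.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error('Circuit open -- failing fast');
      }
      this.state = 'half-open'; // cooldown elapsed: allow one probe call
    }
    try {
      const result = await this.fn(...args);
      this.failures = 0;
      this.state = 'closed'; // probe (or normal call) succeeded
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = Date.now();
      }
      throw err;
    }
  }
}

module.exports = MiniBreaker;
```

Closed lets calls through; after `failureThreshold` consecutive failures the breaker opens and fails fast without touching the downstream service; after the cooldown, a single probe call decides whether to close again.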

2.3 gRPC

gRPC uses Protocol Buffers (binary format) over HTTP/2. It is faster than REST/JSON but requires more setup.

// user.proto -- Protocol Buffer definition
syntax = "proto3";

package user;

service UserService {
  rpc GetUser (GetUserRequest) returns (UserResponse);
  rpc CreateUser (CreateUserRequest) returns (UserResponse);
  rpc GetUsersBatch (GetUsersBatchRequest) returns (UsersBatchResponse);
}

message GetUserRequest {
  string id = 1;
}

message CreateUserRequest {
  string name = 1;
  string email = 2;
}

message UserResponse {
  string id = 1;
  string name = 2;
  string email = 3;
  string created_at = 4;
}

message GetUsersBatchRequest {
  repeated string ids = 1;
}

message UsersBatchResponse {
  repeated UserResponse users = 1;
}
// user-service/grpcServer.js
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');
const db = require('./db'); // pg Pool (assumed to be exported by ./db)

const packageDef = protoLoader.loadSync('./user.proto');
const userProto = grpc.loadPackageDefinition(packageDef).user;

const server = new grpc.Server();

server.addService(userProto.UserService.service, {
  GetUser: async (call, callback) => {
    const user = await db.query('SELECT * FROM users WHERE id = $1', [call.request.id]);
    if (!user.rows[0]) {
      return callback({ code: grpc.status.NOT_FOUND, message: 'User not found' });
    }
    callback(null, user.rows[0]);
  },

  CreateUser: async (call, callback) => {
    const { name, email } = call.request;
    const result = await db.query(
      'INSERT INTO users (name, email) VALUES ($1, $2) RETURNING *',
      [name, email]
    );
    callback(null, result.rows[0]);
  },

  GetUsersBatch: async (call, callback) => {
    const result = await db.query('SELECT * FROM users WHERE id = ANY($1)', [call.request.ids]);
    callback(null, { users: result.rows });
  },
});

server.bindAsync('0.0.0.0:50051', grpc.ServerCredentials.createInsecure(), () => {
  console.log('gRPC User Service on :50051');
});
// order-service/clients/userGrpcClient.js
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

const packageDef = protoLoader.loadSync('./user.proto');
const userProto = grpc.loadPackageDefinition(packageDef).user;

const client = new userProto.UserService(
  process.env.USER_GRPC_URL || 'localhost:50051',
  grpc.credentials.createInsecure()
);

// Promisify for async/await usage
function getUser(id) {
  return new Promise((resolve, reject) => {
    client.GetUser({ id }, (err, response) => {
      if (err) return reject(err);
      resolve(response);
    });
  });
}

function getUsersBatch(ids) {
  return new Promise((resolve, reject) => {
    client.GetUsersBatch({ ids }, (err, response) => {
      if (err) return reject(err);
      resolve(response.users);
    });
  });
}

module.exports = { getUser, getUsersBatch };

2.4 REST vs gRPC Comparison

Dimension       | REST (HTTP/JSON)                   | gRPC (HTTP/2 + Protobuf)
----------------+------------------------------------+---------------------------------------------
Format          | JSON (text, human-readable)        | Protobuf (binary, compact)
Speed           | Slower (text parsing)              | Faster (binary serialisation)
Streaming       | Limited (SSE, WebSockets separate) | Built-in bidirectional streaming
Contract        | OpenAPI/Swagger (optional)         | .proto files (required, strict)
Browser support | Native                             | Requires gRPC-Web proxy
Debugging       | Easy (curl, Postman)               | Harder (need gRPC tools)
Adoption        | Universal                          | Growing, strong in internal services
Best for        | External APIs, simple services     | Internal service-to-service, high throughput

3. Asynchronous Communication

3.1 Message Queues (Point-to-Point)

A producer sends a message to a queue. One consumer processes it. The message is removed after processing.

Producer --> [Queue] --> Consumer

Example: "Process this payment" --> [payment-queue] --> Payment Worker
// order-service/publisher.js -- Publishing to RabbitMQ queue
const amqp = require('amqplib');

let channel;

async function connect() {
  const conn = await amqp.connect(process.env.RABBITMQ_URL || 'amqp://localhost');
  channel = await conn.createChannel();
  await channel.assertQueue('payment-processing', { durable: true });
  await channel.assertQueue('email-notifications', { durable: true });
}

async function requestPayment(orderData) {
  channel.sendToQueue(
    'payment-processing',
    Buffer.from(JSON.stringify({
      orderId: orderData.id,
      userId: orderData.userId,
      amount: orderData.totalAmount,
      timestamp: new Date().toISOString(),
    })),
    { persistent: true } // survive broker restart
  );
}

async function requestEmailNotification(emailData) {
  channel.sendToQueue(
    'email-notifications',
    Buffer.from(JSON.stringify(emailData)),
    { persistent: true }
  );
}

module.exports = { connect, requestPayment, requestEmailNotification };
// payment-service/worker.js -- Consuming from RabbitMQ queue
const amqp = require('amqplib');

async function startWorker() {
  const conn = await amqp.connect(process.env.RABBITMQ_URL || 'amqp://localhost');
  const channel = await conn.createChannel();

  await channel.assertQueue('payment-processing', { durable: true });
  channel.prefetch(1); // Process one message at a time

  console.log('Payment worker waiting for messages...');

  channel.consume('payment-processing', async (msg) => {
    const payment = JSON.parse(msg.content.toString());
    console.log(`Processing payment for order ${payment.orderId}: $${payment.amount}`);

    try {
      // Process payment (call Stripe, etc.)
      const result = await processPayment(payment);

      if (result.success) {
        // Publish success event
        channel.sendToQueue(
          'order-updates',
          Buffer.from(JSON.stringify({
            orderId: payment.orderId,
            status: 'paid',
            paymentId: result.paymentId,
          })),
          { persistent: true }
        );
      } else {
        // Publish failure event
        channel.sendToQueue(
          'order-updates',
          Buffer.from(JSON.stringify({
            orderId: payment.orderId,
            status: 'payment_failed',
            reason: result.error,
          })),
          { persistent: true }
        );
      }

      channel.ack(msg); // Acknowledge -- message is removed from queue
    } catch (err) {
      console.error('Payment processing error:', err);
      // Negative acknowledge -- requeue for retry.
      // Careful: requeue=true retries a poison message forever;
      // in production, route to a dead-letter queue after N attempts.
      channel.nack(msg, false, true);
    }
  });
}

startWorker();

3.2 Events / Pub-Sub (One-to-Many)

A producer publishes an event to a topic. Multiple consumers can subscribe and react independently.

Producer --> [Topic: order.created] --> Consumer A (Inventory)
                                   --> Consumer B (Notification)
                                   --> Consumer C (Analytics)

Each consumer gets a copy of the event.
They process independently. They do not know about each other.
// order-service/events.js -- Publishing events to RabbitMQ exchange (pub/sub)
const amqp = require('amqplib');

let channel;

async function connect() {
  const conn = await amqp.connect(process.env.RABBITMQ_URL || 'amqp://localhost');
  channel = await conn.createChannel();
  // 'topic' exchange allows routing by pattern
  await channel.assertExchange('domain-events', 'topic', { durable: true });
}

async function publish(eventType, data) {
  const event = {
    eventId: `evt_${Date.now()}_${Math.random().toString(36).slice(2, 11)}`,
    eventType,
    timestamp: new Date().toISOString(),
    data,
  };

  channel.publish(
    'domain-events',
    eventType, // routing key, e.g., "order.created"
    Buffer.from(JSON.stringify(event)),
    { persistent: true }
  );

  console.log(`Published event: ${eventType}`);
}

module.exports = { connect, publish };

// Usage in order route:
// await publish('order.created', { orderId: order.id, userId, items, total });
// inventory-service/eventConsumer.js -- Subscribing to events
const amqp = require('amqplib');

async function subscribeToOrderEvents() {
  const conn = await amqp.connect(process.env.RABBITMQ_URL || 'amqp://localhost');
  const channel = await conn.createChannel();

  await channel.assertExchange('domain-events', 'topic', { durable: true });

  // Each service gets its own queue (durable, survives restarts)
  const q = await channel.assertQueue('inventory-order-events', { durable: true });

  // Bind to specific event patterns
  await channel.bindQueue(q.queue, 'domain-events', 'order.created');
  await channel.bindQueue(q.queue, 'domain-events', 'order.cancelled');

  channel.consume(q.queue, async (msg) => {
    const event = JSON.parse(msg.content.toString());

    switch (event.eventType) {
      case 'order.created':
        await reserveInventory(event.data);
        break;
      case 'order.cancelled':
        await releaseInventory(event.data);
        break;
    }

    channel.ack(msg);
  });

  console.log('Inventory Service subscribed to order events');
}
// notification-service/eventConsumer.js -- Another subscriber to the SAME events
async function subscribeToEvents() {
  const conn = await amqp.connect(process.env.RABBITMQ_URL || 'amqp://localhost');
  const channel = await conn.createChannel();

  await channel.assertExchange('domain-events', 'topic', { durable: true });

  const q = await channel.assertQueue('notification-events', { durable: true });

  // Subscribe to multiple event types
  await channel.bindQueue(q.queue, 'domain-events', 'order.*');     // all order events
  await channel.bindQueue(q.queue, 'domain-events', 'payment.*');   // all payment events
  await channel.bindQueue(q.queue, 'domain-events', 'user.created');

  channel.consume(q.queue, async (msg) => {
    const event = JSON.parse(msg.content.toString());

    switch (event.eventType) {
      case 'order.created':
        await sendEmail(event.data.userId, 'Order Confirmation', `Order ${event.data.orderId} received!`);
        break;
      case 'payment.succeeded':
        await sendEmail(event.data.userId, 'Payment Received', `Payment of $${event.data.amount} confirmed.`);
        break;
      case 'user.created':
        await sendEmail(event.data.email, 'Welcome!', 'Thanks for signing up.');
        break;
    }

    channel.ack(msg);
  });

  console.log('Notification Service subscribed to events');
}
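
RabbitMQ performs the routing-key matching above inside the broker. Purely as a sketch of the semantics (`*` matches exactly one dot-separated word, `#` matches zero or more), a matcher could look like this (hypothetical helper, not an amqplib API):

```javascript
// Sketch of RabbitMQ topic-exchange matching semantics.
// '*' matches exactly one word; '#' matches zero or more words.
function topicMatches(pattern, routingKey) {
  const p = pattern.split('.');
  const k = routingKey.split('.');

  function match(pi, ki) {
    if (pi === p.length) return ki === k.length; // pattern consumed: key must be too
    if (p[pi] === '#') {
      // '#' can absorb zero or more words: try every possible split
      for (let skip = ki; skip <= k.length; skip++) {
        if (match(pi + 1, skip)) return true;
      }
      return false;
    }
    if (ki === k.length) return false;           // key exhausted, pattern is not
    if (p[pi] === '*' || p[pi] === k[ki]) return match(pi + 1, ki + 1);
    return false;
  }

  return match(0, 0);
}

module.exports = topicMatches;
```

This is why `order.*` above catches `order.created` and `order.cancelled` but would not catch a hypothetical `order.item.added` -- for that you would bind `order.#`.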

4. Message Brokers: RabbitMQ vs Kafka

Dimension         | RabbitMQ                         | Apache Kafka
------------------+----------------------------------+-------------------------------------------------
Model             | Message queue (traditional)      | Distributed log (append-only)
Delivery          | Push to consumers                | Consumers pull from log
Message retention | Deleted after acknowledgement    | Retained for configurable duration
Ordering          | Per queue                        | Per partition
Throughput        | Thousands/sec                    | Millions/sec
Replay            | Not possible (gone after ack)    | Possible (consumers can re-read the log)
Consumer groups   | Competing consumers on one queue | Consumer groups with partition assignment
Best for          | Task queues, RPC, routing        | Event streaming, event sourcing, high throughput
Complexity        | Lower                            | Higher (ZooKeeper/KRaft, partitions, offsets)

RabbitMQ mental model:
  Queue is a mailbox. Message goes in, one consumer takes it out. Gone.

Kafka mental model:
  Log is a newspaper archive. Events are appended. Consumers read at their own pace.
  Old events are still there for replay.
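
The two mental models can be sketched in a few lines of in-memory JavaScript (illustrative structures, not broker APIs):

```javascript
// Queue: a message is removed as soon as one consumer takes it.
class Queue {
  constructor() { this.messages = []; }
  send(msg) { this.messages.push(msg); }
  take() { return this.messages.shift(); } // gone after this
}

// Log: events are appended forever; each consumer tracks its own offset.
class Log {
  constructor() { this.events = []; this.offsets = new Map(); }
  append(event) { this.events.push(event); }
  read(consumer) {
    const offset = this.offsets.get(consumer) ?? 0;
    const batch = this.events.slice(offset); // everything since last read
    this.offsets.set(consumer, this.events.length);
    return batch;
  }
  replay(consumer) { this.offsets.set(consumer, 0); } // rewind for reprocessing
}

module.exports = { Queue, Log };
```

A second `take()` on the queue returns nothing -- the message is gone. The log can hand the same events to any number of consumers, and `replay()` rewinds one of them to reprocess history, which is exactly the replay property the table above calls out.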

5. When to Use Each Pattern

5.1 Decision Guide

+-----------------------------------------------------+
|  "Does the caller NEED the result right now?"       |
|                                                     |
|  YES --> Synchronous (REST or gRPC)                 |
|    |                                                |
|    +-- "Is it internal service-to-service?"         |
|    |     YES and high throughput --> gRPC           |
|    |     YES and simple --> REST                    |
|    |     NO (external/browser) --> REST             |
|                                                     |
|  NO --> Asynchronous                                |
|    |                                                |
|    +-- "Does exactly ONE consumer process this?"    |
|    |     YES --> Message Queue (point-to-point)     |
|    |                                                |
|    +-- "Do MULTIPLE consumers need this event?"     |
|          YES --> Pub/Sub (topic/exchange)           |
+-----------------------------------------------------+

5.2 Common Scenarios

Scenario                                   | Pattern                        | Why
-------------------------------------------+--------------------------------+-------------------------------------------------
User clicks "View Profile"                 | Sync REST                      | Need immediate response to render UI
User places an order                       | Sync REST + async events       | Immediate acknowledgement, then async processing
Resize uploaded image                      | Async queue                    | Heavy processing; user does not wait
Notify multiple services of a state change | Pub/sub events                 | Multiple consumers, decoupled
Real-time analytics pipeline               | Kafka streaming                | High throughput, ordered, replayable
Request credit score from external API     | Sync REST with circuit breaker | External dependency; need the result now
Nightly batch report                       | Async queue                    | Scheduled, no real-time need
Sync search index after product update     | Async event                    | Eventual consistency is fine for search

6. Latency Implications

6.1 Synchronous Call Chains

Client --> Gateway --> Service A --> Service B --> Service C

Total latency = A + B + C + network overhead x3 (plus gateway time)

If each service takes 50ms and the network adds 5ms per hop:
  Total = (50 + 5) x 3 = 165ms

If Service C is slow (200ms):
  Total = 50 + 5 + 50 + 5 + 200 + 5 = 315ms

Latencies in a chain ADD UP: the slowest service dominates the total,
and every hop's latency lands on the end user.

Mitigation strategies:

// 1. Parallelise independent calls
const [user, products, recommendations] = await Promise.all([
  userClient.getUser(userId),
  catalogClient.getProducts(productIds),
  recommendationClient.getRecommendations(userId),
]);
// Total latency = max(user, products, recommendations) instead of sum

// 2. Use caching to avoid repeated calls
const NodeCache = require('node-cache');
const cache = new NodeCache({ stdTTL: 300 }); // 5-minute TTL

async function getUserCached(userId) {
  const cached = cache.get(`user:${userId}`);
  if (cached) return cached;

  const user = await userClient.getUser(userId);
  cache.set(`user:${userId}`, user);
  return user;
}

// 3. Set aggressive timeouts
const response = await axios.get(url, {
  timeout: 2000, // Do not wait more than 2 seconds
});

6.2 Asynchronous Latency

Client --> Service A --> Message Broker --> Service B (processes later)
  |
  +-- Gets immediate response (202 Accepted)
  |
  +-- Total perceived latency for client: ~50ms (just Service A)
  |
  +-- Service B processes 100ms later, but client does not wait
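
The accept-fast, process-later shape can be sketched without a broker -- an in-memory job list stands in for the queue, and `worker` for the consumer (all names are illustrative):

```javascript
// Sketch: the handler enqueues the work and answers 202 immediately;
// a background worker drains the queue later.
const jobs = [];
const done = [];

function handleResizeRequest(imageId) {
  jobs.push(imageId); // hand off the work
  return { status: 202, body: { queued: imageId } }; // respond right away
}

async function worker() {
  while (jobs.length > 0) {
    const imageId = jobs.shift();
    await new Promise((resolve) => setTimeout(resolve, 10)); // stand-in for a slow resize
    done.push(imageId);
  }
}

module.exports = { handleResizeRequest, worker };
```

The client pays only for the cost of `handleResizeRequest`; the slow "resize" happens after the 202 has already gone out.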

7. Coupling Implications

7.1 Synchronous Coupling

Temporal coupling: Both services must be running at the same time.

  Service A --REST--> Service B
  If B is down, A fails.

Behavioural coupling: A depends on B's interface.

  If B changes its API, A breaks.

7.2 Asynchronous Decoupling

No temporal coupling: Producer and consumer do not need to be running simultaneously.

  Service A --event--> Broker --event--> Service B
  If B is down, the broker holds the message. B processes when it comes back.

Reduced behavioural coupling: Services agree on event schemas, not API contracts.
  The producer does not know (or care) who consumes the event.

8. Choreography vs Orchestration

8.1 Choreography (Decentralised)

Each service knows what to do when it receives an event. No central coordinator.

 Order        Inventory      Payment      Notification
   |               |              |              |
   |--OrderCreated-|              |              |
   |               |              |              |
   |          InventoryReserved---|              |
   |               |              |              |
   |               |        PaymentSucceeded-----|
   |               |              |              |
   |<-----OrderConfirmed---------|              |
   |               |              |      EmailSent

Pros: Loose coupling, no single point of failure, each service is autonomous.
Cons: Hard to see the full flow; can become "event spaghetti" with many services.

8.2 Orchestration (Centralised)

A central orchestrator tells each service what to do and when.

                   +------------------+
                   |  Order Saga      |
                   |  Orchestrator    |
                   +--------+---------+
                            |
          +-----------------+-----------------+
          |                 |                 |
  1. Reserve         2. Charge         3. Confirm
     Inventory          Payment           Order
          |                 |                 |
     +----+----+       +---+---+        +----+----+
     |Inventory|       |Payment|        |  Order  |
     | Service |       |Service|        | Service |
     +---------+       +-------+        +---------+

Pros: Clear flow visibility, easier to debug, handles complex workflows.
Cons: Central point of failure; the orchestrator can become a "god service."
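
A minimal orchestrator can be sketched as an ordered list of steps, each paired with a compensation that undoes it if a later step fails (a sketch of the saga pattern with stubbed service calls; step names follow the diagram):

```javascript
// Sketch of saga orchestration: run steps in order; if one fails,
// run the compensations of the already-completed steps in reverse.
async function runSaga(steps, context) {
  const completed = [];
  for (const step of steps) {
    try {
      await step.action(context);
      completed.push(step);
    } catch (err) {
      for (const done of completed.reverse()) {
        await done.compensate(context); // undo in reverse order
      }
      return { ok: false, failedAt: step.name, error: err.message };
    }
  }
  return { ok: true };
}

module.exports = runSaga;
```

If charging the payment fails, the orchestrator runs the inventory reservation's compensation (release the stock), leaving the system consistent instead of half-committed.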

8.3 When to Use Each

Criteria            | Choreography                 | Orchestration
--------------------+------------------------------+----------------------------------
Number of steps     | 2-4                          | 5+
Workflow complexity | Simple, linear               | Branching, conditional, parallel
Team structure      | Each team owns their events  | Central platform team available
Observability needs | Low (can trace via events)   | High (need to see full workflow)
Failure handling    | Each service handles its own | Centralised compensation logic

9. Idempotency: The Safety Net

Messages can be delivered more than once (network retries, broker redelivery). Every consumer must be idempotent -- processing the same message twice must produce the same result.

// BAD: Not idempotent -- balance decreases every time message is processed
async function handlePayment(event) {
  await db.query('UPDATE accounts SET balance = balance - $1 WHERE user_id = $2',
    [event.amount, event.userId]);
}
// If this runs twice: user is charged twice!

// GOOD: Idempotent -- uses idempotency key
async function handlePayment(event) {
  // Check if this event was already processed
  const existing = await db.query(
    'SELECT id FROM processed_events WHERE event_id = $1',
    [event.eventId]
  );

  if (existing.rows.length > 0) {
    console.log(`Event ${event.eventId} already processed, skipping`);
    return; // Already processed -- safe to skip
  }

  const client = await db.connect();
  try {
    await client.query('BEGIN');

    await client.query(
      'UPDATE accounts SET balance = balance - $1 WHERE user_id = $2',
      [event.amount, event.userId]
    );

    await client.query(
      'INSERT INTO processed_events (event_id, processed_at) VALUES ($1, NOW())',
      [event.eventId]
    );

    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}
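
The same guard can be demonstrated without a database. A sketch with an in-memory set of processed event ids (in production the set must be durable and updated in the same transaction as the balance, as shown above):

```javascript
// Sketch: idempotent consumer backed by an in-memory set.
// (Real systems persist processed ids transactionally, as above.)
const processedEvents = new Set();
const balances = new Map([['user-1', 100]]);

function handlePaymentIdempotent(event) {
  if (processedEvents.has(event.eventId)) return 'skipped'; // already seen
  balances.set(event.userId, balances.get(event.userId) - event.amount);
  processedEvents.add(event.eventId);
  return 'processed';
}

module.exports = handlePaymentIdempotent;
```

Delivering the same event twice debits the account exactly once -- the redelivery is detected by its event id and skipped.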

10. Communication Patterns at a Glance

+----------------+-------------------+-------------------+
|                | Request-Response  | Event-Driven      |
+----------------+-------------------+-------------------+
| Synchronous    | REST, gRPC        | (rare -- webhooks)|
+----------------+-------------------+-------------------+
| Asynchronous   | Async request via | Pub/sub,          |
|                | queue (RPC over   | event streaming   |
|                | message broker)   | (Kafka, RabbitMQ) |
+----------------+-------------------+-------------------+

11. Key Takeaways

  1. Default to synchronous REST for simplicity. Switch to async or gRPC only when you have a clear reason.
  2. Asynchronous communication decouples services in time -- the producer and consumer do not need to be running simultaneously.
  3. Use message queues for work distribution (one consumer per message). Use pub/sub for event notification (many consumers per event).
  4. gRPC beats REST for internal high-throughput communication but adds complexity and requires proto file management.
  5. Synchronous call chains amplify latency. Use Promise.all to parallelise independent calls and set aggressive timeouts.
  6. Circuit breakers prevent cascading failures when a downstream service is slow or down.
  7. Choreography works for simple flows; orchestration works for complex workflows. Most real systems use a mix of both.
  8. Every consumer must be idempotent. Messages will be delivered more than once. Design for it.

12. Explain-It Challenge

  1. You are building a food delivery app. When a customer places an order, the system must: validate the restaurant is open, confirm menu items are available, calculate delivery fee, charge the customer, and notify the restaurant. Design the communication pattern. Which calls are synchronous? Which are asynchronous? Why?

  2. A developer proposes using Kafka for all inter-service communication, even simple request-response queries like "get user by ID." Explain why this is overkill and what you would recommend instead.

  3. Your Order Service calls the Inventory Service, which calls the Pricing Service, which calls the Discount Service. The total latency is 800ms. Propose three concrete strategies to reduce this latency to under 200ms.

