Episode 6 — Scaling, Reliability, Microservices, Web3 / 6.2 — Building and Orchestrating Microservices

6.2.e — Event Payloads & Idempotency

In one sentence: Well-designed event payloads include a unique ID, timestamp, source, and versioned schema, and every consumer must be idempotent — processing the same event twice should produce the same result as processing it once.

Navigation: <- 6.2.d Event-Driven Architecture | 6.2 Exercise Questions ->


1. Designing Event Payloads

The anatomy of a well-designed event

{
  "type": "order.placed",
  "data": {
    "orderId": "ord_12345",
    "userId": "usr_67890",
    "items": [
      { "productId": "prod_111", "quantity": 2, "priceAtOrder": 29.99 }
    ],
    "totalAmount": 59.98,
    "currency": "USD",
    "shippingAddress": {
      "city": "San Francisco",
      "state": "CA",
      "zip": "94102"
    }
  },
  "metadata": {
    "eventId": "evt_abc123def456",
    "timestamp": "2026-04-11T14:30:00.000Z",
    "source": "order-service",
    "version": "2.0",
    "correlationId": "req_xyz789",
    "causationId": "evt_previous_event_id"
  }
}

Required fields explained

Field                  | Purpose                               | Why It Matters
-----------------------|---------------------------------------|--------------------------------
type                   | Event name (noun.verb, past tense)    | Consumers route/filter by type
data                   | The business payload                  | What happened
metadata.eventId       | Globally unique identifier            | Deduplication, idempotency
metadata.timestamp     | When the event occurred               | Ordering, debugging, auditing
metadata.source        | Which service produced it             | Debugging, filtering
metadata.version       | Schema version                        | Backward-compatible evolution
metadata.correlationId | Links related events across services  | Distributed tracing
metadata.causationId   | The event that caused this event      | Event chain tracking

2. Event Naming Conventions

Pattern: <entity>.<action-past-tense>

GOOD naming:
  order.placed           (an order was placed)
  order.shipped          (an order was shipped)
  order.cancelled        (an order was cancelled)
  user.created           (a user was created)
  user.email_verified    (a user's email was verified)
  payment.completed      (a payment was completed)
  payment.failed         (a payment failed)
  inventory.reserved     (inventory was reserved)
  inventory.released     (inventory was released)

BAD naming:
  createOrder            (imperative — sounds like a command, not an event)
  ORDER_CREATED          (inconsistent casing)
  new-order              (vague, no entity.action pattern)
  sendEmail              (command, not event — the email hasn't been sent yet)

Key distinction: Events describe something that happened (past tense). Commands describe something you want to happen (imperative).

Event:   "order.placed"   → "Hey everyone, an order was placed"
Command: "place.order"    → "Hey order-service, place this order"
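The convention can be enforced mechanically at publish time. The regex below checks for lowercase entity.action in snake_case; past tense itself cannot be verified by a regex, so that part remains a code-review concern (this validator is a sketch, not an established library function):

```javascript
// Enforces the <entity>.<action> shape: lowercase snake_case entity,
// a dot, lowercase snake_case action. Rejects camelCase, SCREAMING_CASE,
// and hyphenated names.
const EVENT_NAME = /^[a-z]+(_[a-z]+)*\.[a-z]+(_[a-z]+)*$/;

function isValidEventName(name) {
  return EVENT_NAME.test(name);
}
```

Wiring this into the publisher turns naming drift into a failed publish instead of a convention document nobody reads.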

3. Schema Versioning

As your system evolves, event schemas change. You need a strategy to handle this without breaking consumers.

Strategy 1: Additive changes only (recommended)

// Version 1.0
{
  "type": "order.placed",
  "data": {
    "orderId": "ord_123",
    "userId": "usr_456",
    "totalAmount": 59.98
  },
  "metadata": { "version": "1.0" }
}

// Version 1.1 — ADDED fields (backward compatible)
{
  "type": "order.placed",
  "data": {
    "orderId": "ord_123",
    "userId": "usr_456",
    "totalAmount": 59.98,
    "currency": "USD",          // NEW — old consumers ignore it
    "discountApplied": 5.00     // NEW — old consumers ignore it
  },
  "metadata": { "version": "1.1" }
}

Rule: Only ADD fields. Never remove or rename existing fields. Old consumers that don't know about new fields simply ignore them.
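On the consumer side, the additive rule shows up as "read only what you know": destructure the fields you depend on, default the ones added in later versions, and ignore the rest. A sketch (function name is illustrative):

```javascript
// Reads only the fields this consumer depends on. Fields the producer adds
// later (discountApplied, ...) are simply ignored, and a default covers
// events published before `currency` existed in v1.1.
function readOrderPlaced(event) {
  const { orderId, userId, totalAmount, currency = 'USD' } = event.data;
  return { orderId, userId, totalAmount, currency };
}
```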

Strategy 2: Version-specific handling

// Consumer handles multiple versions
async function handleOrderPlaced(event) {
  const version = event.metadata.version;

  switch (version) {
    case '1.0':
      // Original format
      await processOrderV1(event.data);
      break;
    case '2.0':
      // New format with additional fields
      await processOrderV2(event.data);
      break;
    default:
      console.warn(`Unknown version ${version} for order.placed. Attempting v2 handler.`);
      await processOrderV2(event.data);
  }
}

Strategy 3: Schema registry

For larger systems, use a schema registry (like Confluent Schema Registry with Kafka) that validates event schemas at publish time and ensures compatibility.
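Even without a registry, validating the envelope at publish time catches malformed events before they reach consumers. A minimal hand-rolled check might look like the sketch below — it is not a substitute for a real registry or a JSON Schema validator, just the idea in miniature:

```javascript
// Minimal envelope validation: checks the required fields from the
// metadata table above. Returns a list of problems (empty = valid).
const REQUIRED_METADATA = ['eventId', 'timestamp', 'source', 'version'];

function validateEnvelope(event) {
  const errors = [];
  if (typeof event.type !== 'string' || !event.type.includes('.')) {
    errors.push('type must be an "<entity>.<action>" string');
  }
  if (typeof event.data !== 'object' || event.data === null) {
    errors.push('data must be an object');
  }
  const meta = event.metadata || {};
  for (const field of REQUIRED_METADATA) {
    if (!meta[field]) errors.push(`metadata.${field} is required`);
  }
  return errors;
}
```

A publisher can refuse to send any event where this returns a non-empty list.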


4. Why Events Can Be Delivered Multiple Times

In distributed systems, at-least-once delivery is the default guarantee. Messages can be delivered more than once because of:

Scenario 1: Consumer processes, then crashes before ack
  Queue → deliver message → Consumer processes → CRASH (before ack)
  Queue → "no ack received, redeliver" → deliver same message again

Scenario 2: Network issue during ack
  Queue → deliver → Consumer processes → sends ack → NETWORK ERROR
  Queue → "no ack received" → redeliver

Scenario 3: Publisher retries after timeout
  Publisher → publish → TIMEOUT (no confirmation)
  Publisher → publish again (same event)
  Queue now has TWO copies of the same event

Result: Your consumer WILL see duplicate messages. Plan for it.

Delivery guarantees:
  At-most-once:  Message might be lost, never duplicated
                 (ack before processing — fast but lossy)

  At-least-once: Message never lost, might be duplicated    ← RabbitMQ default
                 (ack after processing — safe but needs idempotency)

  Exactly-once:  Message never lost, never duplicated
                 (extremely hard — requires distributed transactions)
                 (Kafka achieves this with idempotent producers + transactions)
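The redelivery scenarios above can be reproduced with a toy in-memory queue: a message stays in flight until acked, and anything unacked is delivered again on the next poll. This is illustrative only, not how a real broker is implemented:

```javascript
// Toy at-least-once queue. A message is only retired once the consumer
// calls ack(); an unacked message is redelivered on every deliver() pass.
class ToyQueue {
  constructor() {
    this.messages = [];
  }

  publish(msg) {
    this.messages.push({ msg, acked: false });
  }

  // Delivers every unacked message. The consumer receives the message
  // and an ack callback it must call after successful processing.
  deliver(consumer) {
    for (const entry of this.messages) {
      if (!entry.acked) {
        consumer(entry.msg, () => { entry.acked = true; });
      }
    }
  }
}
```

Run `deliver()` with a consumer that "crashes" before acking, then again with one that acks: the same message arrives twice, which is exactly why the handlers in the next section check event IDs.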

5. Ensuring Idempotency

Idempotent means: processing the same event 1 time or 100 times produces the same result.

5.1 Idempotency Key (Event ID)

Store the event ID of every processed event. Before processing, check if you have already seen it.

// services/notification-service/src/idempotency.js

// In production, use Redis or a database instead of in-memory
const processedEvents = new Set();

function isAlreadyProcessed(eventId) {
  return processedEvents.has(eventId);
}

function markAsProcessed(eventId) {
  processedEvents.add(eventId);
}

module.exports = { isAlreadyProcessed, markAsProcessed };

// Using idempotency in a consumer handler
const { isAlreadyProcessed, markAsProcessed } = require('./idempotency');

async function handleOrderPlaced(event) {
  const eventId = event.metadata.eventId;

  // Step 1: Check if already processed
  if (isAlreadyProcessed(eventId)) {
    console.log(`[idempotency] Skipping duplicate event: ${eventId}`);
    return; // Skip — already handled
  }

  // Step 2: Process the event
  await sendOrderConfirmationEmail(event.data);

  // Step 3: Mark as processed AFTER successful processing
  markAsProcessed(eventId);
  console.log(`[idempotency] Processed event: ${eventId}`);
}

5.2 Production Idempotency with Redis

// shared/utils/idempotency-redis.js
const Redis = require('ioredis');
const redis = new Redis(process.env.REDIS_URL || 'redis://localhost:6379');

async function isAlreadyProcessed(eventId) {
  const exists = await redis.exists(`processed:${eventId}`);
  return exists === 1;
}

async function markAsProcessed(eventId, ttlSeconds = 86400) {
  // Store with TTL — no need to keep forever
  // 24 hours is usually enough to prevent duplicates
  await redis.set(`processed:${eventId}`, '1', 'EX', ttlSeconds);
}

module.exports = { isAlreadyProcessed, markAsProcessed };
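One caveat: the separate isAlreadyProcessed/markAsProcessed calls can race if two consumers pick up the same duplicate concurrently; both pass the check before either marks. Redis supports an atomic alternative: `redis.set(key, '1', 'EX', ttlSeconds, 'NX')` sets the key only if it does not already exist, returning 'OK' for the first claim and null for duplicates, all in one round trip. The in-memory sketch below mimics that claim semantics (the function name is illustrative):

```javascript
// In-memory stand-in for Redis "SET key value EX ttl NX": returns true only
// for the first caller within the TTL, so check and mark are one atomic step.
const claimed = new Map(); // eventId -> expiry timestamp (ms)

function tryClaim(eventId, ttlMs = 86_400_000) {
  const now = Date.now();
  const expiry = claimed.get(eventId);
  if (expiry !== undefined && expiry > now) return false; // duplicate
  claimed.set(eventId, now + ttlMs);
  return true; // first time — caller may process
}
```

Note the trade-off: claiming before processing means a crash mid-processing leaves the event unhandled until the TTL expires, so real systems pair this with retries or move the mark into the same transaction as the side effect (as in 5.3 below).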

5.3 Database-Level Idempotency

For critical operations, use database constraints:

// Using unique constraints for idempotency
async function processPayment(event) {
  const { orderId, amount } = event.data;
  const eventId = event.metadata.eventId;

  try {
    // The UNIQUE constraint on event_id prevents duplicate processing
    await db.query(
      `INSERT INTO payments (order_id, amount, event_id, processed_at)
       VALUES ($1, $2, $3, NOW())`,
      [orderId, amount, eventId]
    );

    // If we get here, this is the first time processing this event
    await chargePaymentProvider(orderId, amount);

  } catch (err) {
    if (err.code === '23505') {
      // Unique violation — already processed
      console.log(`Payment for event ${eventId} already processed. Skipping.`);
      return;
    }
    throw err; // Real error — rethrow
  }
}

6. Deduplication Strategies

Strategy                   | How It Works                          | Pros                 | Cons
---------------------------|---------------------------------------|----------------------|--------------------------------------
Event ID check (Redis)     | Store processed IDs in Redis with TTL | Fast, simple         | Requires Redis; TTL might expire
Database unique constraint | Unique index on event_id column       | Strongest guarantee  | Requires DB; adds a column
Upsert / ON CONFLICT       | Insert or update on conflict          | Naturally idempotent | Only works for create/update ops
Conditional update         | UPDATE ... WHERE status = 'pending'   | No extra storage     | Only works for state-machine patterns
Idempotency window         | Only deduplicate within a time window | Bounded storage      | Events outside window can duplicate

Upsert pattern

// Naturally idempotent — inserting the same order twice has the same result
async function handleOrderPlaced(event) {
  const { orderId, userId, totalAmount } = event.data;

  await db.query(
    `INSERT INTO orders (id, user_id, total_amount, status, created_at)
     VALUES ($1, $2, $3, 'placed', NOW())
     ON CONFLICT (id) DO NOTHING`,  // If orderId already exists, skip
    [orderId, userId, totalAmount]
  );
}

Conditional state update pattern

// Only transition from 'placed' to 'shipped' — safe to call multiple times
async function handleOrderShipped(event) {
  const { orderId, trackingNumber } = event.data;

  const result = await db.query(
    `UPDATE orders
     SET status = 'shipped', tracking_number = $2, shipped_at = NOW()
     WHERE id = $1 AND status = 'placed'`,  // Only update if currently 'placed'
    [orderId, trackingNumber]
  );

  if (result.rowCount === 0) {
    // Either already shipped (idempotent) or order doesn't exist
    console.log(`Order ${orderId} not in 'placed' state. Skipping.`);
  }
}

7. Eventual Consistency

In a microservices system with events, data is eventually consistent — not immediately consistent.

IMMEDIATE CONSISTENCY (monolith):
  User places order → database updated → notification sent → ALL IN ONE TRANSACTION
  At any point in time, all data is consistent

EVENTUAL CONSISTENCY (microservices):
  t=0ms:  Order Service saves order (status: placed)
  t=5ms:  Event published: "order.placed"
  t=50ms: Notification Service receives event, sends email
  t=80ms: Analytics Service receives event, updates dashboard
  t=200ms: Inventory Service receives event, reserves stock

  Between t=0ms and t=200ms, the system is INCONSISTENT:
    - Order exists but inventory not reserved
    - Order exists but email not sent
    - This is OKAY — it becomes consistent eventually

When eventual consistency is acceptable

Scenario                         | Acceptable? | Why
---------------------------------|-------------|--------------------------------------------------------------
Sending order confirmation email | Yes         | Delay of a few seconds is fine
Updating analytics dashboard     | Yes         | Near-real-time is good enough
Reserving inventory              | Depends     | Short delay OK; too long = overselling
Deducting payment                | No          | Use a synchronous call; must succeed before order confirmation

8. Handling Out-of-Order Events

Events may arrive in a different order than they occurred, especially when services process at different speeds.

Published in order:
  1. order.placed     (t=0ms)
  2. order.payment_received  (t=100ms)
  3. order.shipped    (t=500ms)

Received by analytics service:
  1. order.placed     (t=10ms)   ← arrived first ✓
  3. order.shipped    (t=60ms)   ← arrived BEFORE payment! ✗
  2. order.payment_received  (t=120ms)  ← arrived last

Solutions

Solution 1: Timestamp-based ordering

async function handleEvent(event) {
  const { orderId } = event.data;
  const eventTimestamp = new Date(event.metadata.timestamp);

  // Get the last processed event timestamp for this order
  const lastProcessed = await db.query(
    'SELECT last_event_timestamp FROM orders WHERE id = $1',
    [orderId]
  );

  if (lastProcessed.rows[0]?.last_event_timestamp > eventTimestamp) {
    console.log(`Ignoring out-of-order event for order ${orderId}`);
    return; // Skip stale event
  }

  // Process and update the timestamp
  await processEvent(event);
  await db.query(
    'UPDATE orders SET last_event_timestamp = $1 WHERE id = $2',
    [eventTimestamp, orderId]
  );
}

Solution 2: State machine validation

const validTransitions = {
  'placed': ['payment_received', 'cancelled'],
  'payment_received': ['shipped', 'cancelled'],
  'shipped': ['delivered'],
  'delivered': [],   // Terminal state
  'cancelled': [],   // Terminal state
};

async function handleOrderEvent(event) {
  const { orderId } = event.data;
  const newStatus = event.type.split('.')[1]; // "order.shipped" → "shipped"

  const order = await db.query('SELECT status FROM orders WHERE id = $1', [orderId]);
  const currentStatus = order.rows[0]?.status;

  if (!validTransitions[currentStatus]?.includes(newStatus)) {
    console.log(
      `Invalid transition: ${currentStatus} -> ${newStatus} for order ${orderId}. ` +
      `Queuing for later processing.`
    );
    // Option: requeue with delay, or store in a "pending transitions" table
    return;
  }

  await db.query('UPDATE orders SET status = $1 WHERE id = $2', [newStatus, orderId]);
}

9. Event Sourcing Concepts

Event sourcing is an architecture where you store every event that happened instead of just the current state.

TRADITIONAL (store current state):
  orders table:
    id=123, status=shipped, total=59.98, updated_at=...
    (History is lost — you can't see it was "placed" then "paid" then "shipped")

EVENT SOURCING (store all events):
  events table:
    event_id=1, order_id=123, type=order.placed,           data={total: 59.98}
    event_id=2, order_id=123, type=order.payment_received,  data={amount: 59.98}
    event_id=3, order_id=123, type=order.shipped,           data={tracking: "UPS123"}

  Current state = replay all events for order 123:
    placed → payment_received → shipped = current status is "shipped"

// Simple event store
class EventStore {
  constructor() {
    this.events = [];
  }

  append(aggregateId, event) {
    this.events.push({
      ...event,
      aggregateId,
      sequence: this.events.filter((e) => e.aggregateId === aggregateId).length + 1,
      storedAt: new Date().toISOString(),
    });
  }

  getEvents(aggregateId) {
    return this.events
      .filter((e) => e.aggregateId === aggregateId)
      .sort((a, b) => a.sequence - b.sequence);
  }

  // Rebuild current state by replaying events
  getState(aggregateId) {
    const events = this.getEvents(aggregateId);
    return events.reduce((state, event) => {
      switch (event.type) {
        case 'order.placed':
          return { ...state, ...event.data, status: 'placed' };
        case 'order.payment_received':
          return { ...state, status: 'paid', paidAmount: event.data.amount };
        case 'order.shipped':
          return { ...state, status: 'shipped', tracking: event.data.tracking };
        case 'order.cancelled':
          return { ...state, status: 'cancelled', reason: event.data.reason };
        default:
          return state;
      }
    }, {});
  }
}

// Usage
const store = new EventStore();
store.append('ord_123', { type: 'order.placed', data: { total: 59.98, userId: 'usr_1' } });
store.append('ord_123', { type: 'order.payment_received', data: { amount: 59.98 } });
store.append('ord_123', { type: 'order.shipped', data: { tracking: 'UPS123' } });

console.log(store.getState('ord_123'));
// { total: 59.98, userId: 'usr_1', status: 'shipped', paidAmount: 59.98, tracking: 'UPS123' }

console.log(store.getEvents('ord_123'));
// All 3 events — full audit trail

When to use event sourcing

Use Event Sourcing                            | Don't Use Event Sourcing
----------------------------------------------|----------------------------------
Audit trail required (finance, healthcare)    | Simple CRUD applications
Need to replay/rebuild state                  | Low event volume
Complex business processes                    | Team unfamiliar with the pattern
Domain events are natural (orders, workflows) | Read-heavy, write-light systems

10. Practical Example: Order Processing Pipeline

Putting it all together — a complete event-driven order processing flow:

// services/order-service/src/handlers/create-order.js
const { publishEvent } = require('../../../../shared/events/publisher');
const { v4: uuidv4 } = require('uuid');

async function createOrder(req, res) {
  const { userId, items } = req.body;
  const orderId = `ord_${uuidv4().slice(0, 8)}`;
  const eventId = `evt_${uuidv4()}`;

  // 1. Calculate total
  const totalAmount = items.reduce(
    (sum, item) => sum + item.price * item.quantity,
    0
  );

  // 2. Save order to database
  const order = {
    id: orderId,
    userId,
    items,
    totalAmount,
    status: 'placed',
    createdAt: new Date().toISOString(),
  };
  await saveOrder(order); // Database insert

  // 3. Publish event AFTER successful save
  await publishEvent('platform-events', 'order.placed', {
    type: 'order.placed',
    data: {
      orderId: order.id,
      userId: order.userId,
      items: order.items,
      totalAmount: order.totalAmount,
    },
    metadata: {
      eventId,
      source: 'order-service',
      correlationId: req.headers['x-request-id'],
      version: '2.0',
    },
  });

  res.status(201).json({ data: order });
}

// services/payment-service/src/consumers/order-placed.js
const { isAlreadyProcessed, markAsProcessed } = require('../../utils/idempotency');
const { publishEvent } = require('../../../../shared/events/publisher');

async function handleOrderPlaced(event) {
  const eventId = event.metadata.eventId;

  // Idempotency check
  if (await isAlreadyProcessed(eventId)) {
    console.log(`[payment] Duplicate event ${eventId}. Skipping.`);
    return;
  }

  const { orderId, userId, totalAmount } = event.data;

  try {
    // Process payment
    const paymentResult = await chargeUser(userId, totalAmount);

    // Mark idempotency AFTER successful processing
    await markAsProcessed(eventId);

    // Publish downstream event
    await publishEvent('platform-events', 'payment.completed', {
      type: 'payment.completed',
      data: {
        orderId,
        paymentId: paymentResult.id,
        amount: totalAmount,
      },
      metadata: {
        source: 'payment-service',
        correlationId: event.metadata.correlationId,
        causationId: eventId,  // This event was caused by the order.placed event
      },
    });
  } catch (err) {
    // Publish failure event
    await publishEvent('platform-events', 'payment.failed', {
      type: 'payment.failed',
      data: {
        orderId,
        reason: err.message,
      },
      metadata: {
        source: 'payment-service',
        correlationId: event.metadata.correlationId,
        causationId: eventId,
      },
    });

    throw err; // Let consumer framework handle retry/DLQ
  }
}

Complete event flow:

  order.placed ──→ payment-service ──→ payment.completed ──→ inventory-service
       │                   │                    │                     │
       │                   │                    │              inventory.reserved
       │                   │                    │                     │
       │              payment.failed            │              notification-service
       │                   │                    │                     │
       │           order.cancelled              │              "Your order is confirmed"
       │                                        │
       └──→ notification-service          shipping-service
             "Order received"              "Ready to ship"

11. Key Takeaways

  1. Every event needs an eventId, timestamp, source, and version — these are non-negotiable metadata fields.
  2. Name events as past-tense facts: order.placed, not createOrder. Events describe what happened, not what should happen.
  3. Assume every event will be delivered at least once — build idempotent consumers using event ID checks, database constraints, or upserts.
  4. Eventual consistency is normal — data across services will be temporarily inconsistent. Design for it.
  5. Handle out-of-order events — use timestamp comparison or state machine validation.
  6. Add schema versions from day one — changing event schemas without versioning breaks consumers.
  7. Correlate events using correlationId for tracing and causationId for event chains.

Explain-It Challenge

  1. The payment service processes an order.placed event, charges the customer, then crashes before acknowledging the message. The queue redelivers the message. What happens if the consumer is not idempotent? How do you prevent double-charging?
  2. Your team has 15 services and no event naming convention. Half use camelCase, half use snake_case, some use imperative verbs. Why is this a problem and what standard would you introduce?
  3. A product manager asks "Why does the order show as placed but the notification hasn't been sent yet?" Explain eventual consistency in terms they will understand.

Navigation: <- 6.2.d Event-Driven Architecture | 6.2 Exercise Questions ->