Episode 9 — System Design / 9.8 — Communication and Data Layer

9.8.e — Data Consistency

Big Picture: In a single-server world, consistency is easy — there is one copy of the data and every read sees the latest write. In distributed systems, data lives on multiple nodes, and keeping them in sync is one of the hardest problems in computer science. Every system design interview requires you to reason about consistency tradeoffs.


Table of Contents

  1. What Is Data Consistency?
  2. Strong vs Eventual Consistency
  3. Consistency Levels
  4. Consistency in Read Replicas
  5. Distributed Transactions
  6. Two-Phase Commit (2PC)
  7. Saga Pattern
  8. CRDTs (Conflict-Free Replicated Data Types)
  9. Real-World Tradeoffs
  10. Key Takeaways
  11. Explain-It Challenge

What Is Data Consistency?

Consistency = all nodes in a distributed system agree on the current value of data.

  CONSISTENT STATE                        INCONSISTENT STATE
  ================                        ==================

  User writes: balance = $500            User writes: balance = $500

  Node A: balance = $500                 Node A: balance = $500
  Node B: balance = $500                 Node B: balance = $300  <-- stale!
  Node C: balance = $500                 Node C: balance = $500
  
  All nodes agree.                       Node B has not received the update yet.
  Any read returns $500.                 A read to Node B returns wrong data.

Strong vs Eventual Consistency

Strong Consistency

Every read returns the most recent write. The system behaves as if there is a single copy of the data.

  Timeline (Strong Consistency):
  ==============================
  
  t=0   Client A: WRITE balance = $500 to Primary
  t=1   Primary: replicate to ALL replicas (wait for ACK)
  t=2   All replicas: ACK received
  t=3   Primary: returns success to Client A
  t=4   Client B: READ from ANY node --> always returns $500
  
  Guarantee: After a write completes, all subsequent reads
  see that write, no matter which node they read from.

Eventual Consistency

If no new writes occur, all nodes will eventually converge to the same value. The window of inconsistency is usually milliseconds to seconds.

  Timeline (Eventual Consistency):
  ================================
  
  t=0   Client A: WRITE balance = $500 to Primary
  t=1   Primary: returns success to Client A immediately
  t=2   Primary: starts replicating to replicas (async)
  t=3   Client B: READ from Replica 1 --> returns $300 (stale!)
  t=4   Replica 1: receives update, now has $500
  t=5   Client B: READ from Replica 1 --> returns $500 (correct)
  
  Guarantee: Data will converge, but reads during replication
  may return stale data.

Comparison

| Property | Strong Consistency | Eventual Consistency |
|---|---|---|
| Read guarantee | Always latest value | May be stale |
| Write latency | Higher (must wait for replicas) | Lower (return immediately) |
| Availability | Lower (fails if replicas unreachable) | Higher (works even if some nodes down) |
| Throughput | Lower | Higher |
| Complexity | Lower (simpler mental model) | Higher (handle stale reads) |
| Examples | Bank balances, inventory counts | Social media likes, view counts |

Consistency Levels

Many databases offer tunable consistency, letting you choose per-query.

Cassandra Consistency Levels

  WRITE Consistency Levels:
  =========================
  
  Replication Factor = 3 (data on 3 nodes)
  
  ONE:     Write to 1 node, return success.
           +-----+  +-----+  +-----+
           |[ACK]|  |     |  |     |    --> Fastest, least safe
           +-----+  +-----+  +-----+
  
  QUORUM:  Write to majority (2 of 3), return success.
           +-----+  +-----+  +-----+
           |[ACK]|  |[ACK]|  |     |    --> Balanced
           +-----+  +-----+  +-----+
  
  ALL:     Write to all 3 nodes, return success.
           +-----+  +-----+  +-----+
           |[ACK]|  |[ACK]|  |[ACK]|    --> Slowest, safest
           +-----+  +-----+  +-----+

The Quorum Formula

  For strong consistency:

  R + W > N

  Where:
    N = number of replicas
    W = number of nodes that must ACK a write
    R = number of nodes that must respond to a read
  
  Example (N=3):
  - W=2, R=2: 2+2=4 > 3 --> STRONG (at least one node has latest)
  - W=1, R=1: 1+1=2 < 3 --> EVENTUAL (might read stale data)
  - W=3, R=1: 3+1=4 > 3 --> STRONG (all nodes always current)
  - W=1, R=3: 1+3=4 > 3 --> STRONG (read all, one must be current)
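The quorum condition is simple enough to check in code. A tiny helper (the function name is illustrative) makes the four examples above concrete:

```python
def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    """R + W > N guarantees that any read quorum overlaps any
    write quorum in at least one node holding the latest value."""
    return r + w > n

# The four examples from above, with N = 3 replicas:
assert is_strongly_consistent(3, 2, 2)      # quorum writes + quorum reads
assert not is_strongly_consistent(3, 1, 1)  # fast, but eventual
assert is_strongly_consistent(3, 3, 1)      # slow writes, fast reads
assert is_strongly_consistent(3, 1, 3)      # fast writes, slow reads
```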

DynamoDB Consistency Options

| Read Type | Behavior | Cost | Latency |
|---|---|---|---|
| Eventually Consistent | May return stale data | 0.5 RCU per 4 KB | Lower |
| Strongly Consistent | Always returns latest | 1 RCU per 4 KB | Higher |

Interview tip: Always specify your consistency requirements when discussing a system. "We need strong consistency for payment data but can tolerate eventual consistency for the activity feed."


Consistency in Read Replicas

The Problem

  User Alice: Creates a post
  
  t=0  [Primary DB] <-- POST /posts {text: "Hello!"}
       Primary writes and returns 201 Created
       
  t=1  Alice: GET /posts (my posts)
       Load balancer sends to Replica 2
       
       [Replica 2] has not received the replication yet!
       Returns: [] (empty list)
       
       Alice: "Where is my post?! The app is broken!"

Solutions

| Pattern | How It Works | Tradeoff |
|---|---|---|
| Read-your-writes | After write, route that user's reads to primary for N seconds | Adds primary load |
| Sticky sessions | Pin user to a specific replica | Uneven load distribution |
| Causal consistency | Track write timestamps, ensure reads see at least that timestamp | Complex implementation |
| Synchronous replication | Primary waits for replicas before returning | Slower writes |
| Client-side merge | Client includes write in local state until confirmed | Complex client logic |

Read-Your-Writes Implementation

  Request Flow:
  =============

  1. Client writes: POST /posts
     Server returns: { "id": 42, "write_token": "ts:1705334400" }

  2. Client reads: GET /posts
     Header: X-Write-Token: ts:1705334400

  3. Load balancer checks:
     - Replica's last-applied timestamp >= 1705334400?
     - YES: Route to replica (safe)
     - NO: Route to primary (ensure consistency)
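The routing decision above can be sketched as a small function. This is a minimal illustration, not a production load balancer; the `Node` type and the timestamps are hypothetical stand-ins for real replication metadata:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    name: str
    last_applied_ts: int  # timestamp of the last replicated write applied

def route_read(write_token_ts: Optional[int],
               replicas: List[Node], primary: Node) -> Node:
    """Send the read to a replica only if it has caught up to the
    client's last write; otherwise fall back to the primary."""
    if write_token_ts is None:
        return replicas[0]  # no recent write: any replica is fine
    for replica in replicas:
        if replica.last_applied_ts >= write_token_ts:
            return replica  # replica already has the client's write
    return primary  # no replica has caught up yet

primary = Node("primary", last_applied_ts=1705334400)
replicas = [Node("replica-1", 1705334300), Node("replica-2", 1705334400)]
assert route_read(1705334400, replicas, primary).name == "replica-2"
assert route_read(1705334500, replicas, primary).name == "primary"
```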

Distributed Transactions

In a microservice architecture, a single business operation may span multiple services and databases. How do you ensure atomicity?

  E-Commerce Order Placement:
  ===========================
  
  1. Order Service:     Create order record
  2. Payment Service:   Charge credit card
  3. Inventory Service: Reserve items
  4. Notification:      Send confirmation email
  
  What if step 3 fails after step 2 succeeded?
  Customer was charged but items were not reserved!

Approaches

| Approach | How | Pros | Cons |
|---|---|---|---|
| 2PC | Coordinator asks all to prepare, then commit | Strong consistency | Blocking, single point of failure |
| Saga | Chain of local transactions with compensations | Non-blocking, resilient | Eventual consistency, complex rollback |
| Outbox | Write to DB + outbox table, publish events | Reliable event publishing | Extra table, polling/CDC needed |

Two-Phase Commit (2PC)

A protocol for coordinating a transaction across multiple databases/services.

  PHASE 1: PREPARE (Voting)
  =========================
  
  Coordinator              Participant A     Participant B     Participant C
       |                        |                 |                 |
       |--- PREPARE ----------->|                 |                 |
       |--- PREPARE --------------------------->  |                 |
       |--- PREPARE --------------------------------------------->  |
       |                        |                 |                 |
       |<-- VOTE YES -----------|                 |                 |
       |<-- VOTE YES ----------------------------|                 |
       |<-- VOTE YES ----------------------------------------------|
       |                        |                 |                 |
  
  All voted YES? Proceed to Phase 2.
  Any voted NO? ABORT all.

  PHASE 2: COMMIT
  ================
  
  Coordinator              Participant A     Participant B     Participant C
       |                        |                 |                 |
       |--- COMMIT ------------>|                 |                 |
       |--- COMMIT ------------------------------>|                 |
       |--- COMMIT ------------------------------------------------>|
       |                        |                 |                 |
       |<-- ACK ----------------|                 |                 |
       |<-- ACK ----------------------------------|                 |
       |<-- ACK ---------------------------------------------------|
       |                        |                 |                 |
  Transaction complete!
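The two phases above can be condensed into a short simulation. This is a toy sketch of the protocol's happy path and abort path (the `Participant` class is hypothetical); a real implementation would also need durable logs and timeout handling:

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name, self.can_commit, self.state = name, can_commit, "idle"

    def prepare(self, txid):
        # Phase 1: persist intent, acquire locks, then vote
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self, txid):
        self.state = "committed"

    def abort(self, txid):
        self.state = "aborted"

def two_phase_commit(txid, participants):
    votes = [p.prepare(txid) for p in participants]  # Phase 1: PREPARE
    if all(votes):
        for p in participants:                       # Phase 2: COMMIT
            p.commit(txid)
        return "committed"
    for p in participants:                           # any NO vote: abort all
        p.abort(txid)
    return "aborted"

assert two_phase_commit("tx1", [Participant("a"), Participant("b")]) == "committed"
assert two_phase_commit("tx2", [Participant("a"), Participant("b", can_commit=False)]) == "aborted"
```

Note what the sketch cannot show: between `prepare` and the coordinator's decision, real participants sit holding locks, which is exactly the blocking problem discussed next.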

2PC Problems

| Problem | Description |
|---|---|
| Blocking | If the coordinator crashes after PREPARE but before COMMIT, participants are stuck holding locks |
| Latency | 2 round trips minimum, participants hold locks throughout |
| Single point of failure | The coordinator is a bottleneck |
| Not partition tolerant | Cannot proceed if the coordinator is partitioned from participants |
| Reduced throughput | Locks are held across the entire distributed transaction |

Interview insight: 2PC still has legitimate uses inside database systems and in XA distributed transactions (e.g., PostgreSQL's PREPARE TRANSACTION), but it is generally avoided across microservice boundaries due to its blocking nature.


Saga Pattern

A saga is a sequence of local transactions. Each step has a compensating transaction that undoes its effect if a later step fails.

Choreography-Based Saga

Each service listens for events and reacts independently.

  Order Service       Payment Service     Inventory Service    Notification
       |                    |                    |                   |
       |-- OrderCreated --->|                    |                   |
       |                    |                    |                   |
       |                    |-- PaymentCharged ->|                   |
       |                    |                    |                   |
       |                    |                    |-- ItemsReserved ->|
       |                    |                    |                   |
       |                    |                    |               SendEmail
       |                    |                    |                   |
  
  IF Inventory fails:
       |                    |                    |                   |
       |                    |<- ReserveFailed ---|                   |
       |                    | (refunds payment)  |                   |
       |<- PaymentRefunded -|                    |                   |
   (cancels order)          |                    |                   |

Orchestration-Based Saga

A central orchestrator coordinates the steps.

  +---------------------+
  |   Saga Orchestrator  |
  |   (Order Saga)       |
  +----------+----------+
             |
    Step 1:  |--- Create Order --------> Order Service
             |<-- Order Created ---------|
             |
    Step 2:  |--- Charge Payment ------> Payment Service
             |<-- Payment Charged -------|
             |
    Step 3:  |--- Reserve Items -------> Inventory Service
             |<-- FAILED: out of stock -|
             |
    Compensate:
    Step 2c: |--- Refund Payment ------> Payment Service
             |<-- Refund Complete -------|
             |
    Step 1c: |--- Cancel Order --------> Order Service
             |<-- Order Cancelled -------|

Choreography vs Orchestration

| Factor | Choreography | Orchestration |
|---|---|---|
| Coupling | Loose (event-driven) | Tighter (orchestrator knows all steps) |
| Complexity | Distributed across services | Centralized in orchestrator |
| Debugging | Harder (trace events across services) | Easier (single flow to follow) |
| Single point of failure | None | Orchestrator |
| Adding steps | Publish new event, add listener | Modify orchestrator |
| Best for | Simple sagas (2-3 steps) | Complex sagas (5+ steps, branching) |

Saga with Compensating Transactions

| Step | Action | Compensation |
|---|---|---|
| 1. Create order | order.status = 'pending' | order.status = 'cancelled' |
| 2. Reserve inventory | item.reserved += qty | item.reserved -= qty |
| 3. Charge payment | payment.charge(amount) | payment.refund(amount) |
| 4. Confirm order | order.status = 'confirmed' | (no compensation needed, last step) |
| 5. Send notification | email.send(confirmation) | email.send(cancellation) |
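An orchestrator that runs such a step/compensation table can be sketched in a few lines. The step functions below are hypothetical stand-ins for real service calls, and the failure in `charge_payment` is simulated:

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order; on failure,
    run the compensations of completed steps in reverse order."""
    completed = []
    for action, compensation in steps:
        try:
            action()
            completed.append(compensation)
        except Exception:
            for comp in reversed(completed):
                comp()
            return "compensated"
    return "completed"

state = {"order": None, "reserved": 0}

def create_order():      state["order"] = "pending"
def cancel_order():      state["order"] = "cancelled"
def reserve_inventory(): state["reserved"] += 1
def release_inventory(): state["reserved"] -= 1
def charge_payment():    raise RuntimeError("card declined")  # simulated failure
def refund_payment():    pass  # nothing was charged in this toy run

result = run_saga([
    (create_order, cancel_order),
    (reserve_inventory, release_inventory),
    (charge_payment, refund_payment),
])
assert result == "compensated"
assert state == {"order": "cancelled", "reserved": 0}
```

Reversing the compensation order matters: later steps may depend on earlier ones, so the undo must unwind them like a stack.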

Saga State Machine

  +----------+     +----------+     +----------+     +-----------+
  |  Order   | --> | Payment  | --> | Inventory| --> | Completed |
  |  Created |     | Charged  |     | Reserved |     |           |
  +----+-----+     +----+-----+     +----+-----+     +-----------+
       |                |                |
       v                v                v
  +---------+     +-----------+     +-----------+
  |  Order  |     | Payment   |     | Inventory |
  | Cancel  | <-- | Refunded  | <-- | Release   |
  +---------+     +-----------+     +-----------+
       |
       v
  +---------+
  |  Failed |
  +---------+

CRDTs (Conflict-Free Replicated Data Types)

CRDTs are data structures that can be replicated across nodes, updated independently, and always converge to a consistent state without coordination.

The Problem CRDTs Solve

  Traditional approach:
  =====================
  Node A: counter = 5, increment -> counter = 6
  Node B: counter = 5, increment -> counter = 6
  
  Merge: counter = 6? or 7? CONFLICT!
  
  CRDT approach (G-Counter):
  ==========================
  Node A: {A: 3, B: 2} -> increment -> {A: 4, B: 2}
  Node B: {A: 3, B: 2} -> increment -> {A: 3, B: 3}
  
  Merge: {A: max(4,3), B: max(2,3)} = {A: 4, B: 3} = total 7
  No conflict! Always correct!
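The G-Counter merge rule above translates directly to code. A minimal sketch, reproducing the scenario just shown:

```python
class GCounter:
    """Grow-only counter: each node increments only its own slot;
    merging takes the per-node maximum, so merges never conflict."""
    def __init__(self, node_id, counts=None):
        self.node_id = node_id
        self.counts = dict(counts or {})

    def increment(self, n=1):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

# Both replicas start at {A: 3, B: 2}, as in the diagram above
a = GCounter("A", {"A": 3, "B": 2})
b = GCounter("B", {"A": 3, "B": 2})
a.increment()  # {A: 4, B: 2}
b.increment()  # {A: 3, B: 3}
a.merge(b)
assert a.counts == {"A": 4, "B": 3} and a.value() == 7
```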

Common CRDT Types

| CRDT | What It Models | Operations | Example Use |
|---|---|---|---|
| G-Counter | Grow-only counter | Increment | View counts, like counts |
| PN-Counter | Counter (up and down) | Increment, Decrement | Inventory (with care), votes |
| G-Set | Grow-only set | Add | Tags, visited pages |
| OR-Set | Set (add and remove) | Add, Remove | Shopping cart items |
| LWW-Register | Last-writer-wins value | Set | User profile fields |
| LWW-Map | Map with LWW per key | Set, Delete | Collaborative document fields |

Shopping Cart as OR-Set CRDT

  User's cart is replicated on 2 nodes:
  
  Node A (US):                    Node B (EU):
  Cart: {laptop, mouse}          Cart: {laptop, mouse}
  
  User adds "keyboard" (via US):  User removes "mouse" (via EU):
  Cart: {laptop, mouse, keyboard} Cart: {laptop}
  
  Merge (OR-Set rules):
  - "keyboard" was added -> include it
  - "mouse" was removed -> exclude it
  - Result: {laptop, keyboard}
  
  Both nodes converge to {laptop, keyboard}. No conflicts.
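The merge rules that produce this result can be sketched as follows: each add gets a unique tag, and a remove deletes only the tags that replica has observed, so a concurrent add always survives. This is illustrative only; real implementations also garbage-collect the remove tombstones:

```python
import uuid

class ORSet:
    """Observed-Remove Set: an element is present if it has at
    least one add tag not yet observed by a remove."""
    def __init__(self):
        self.adds = {}     # element -> set of unique add tags
        self.removes = {}  # element -> set of removed tags

    def add(self, element):
        self.adds.setdefault(element, set()).add(uuid.uuid4().hex)

    def remove(self, element):
        # Remove only the add tags this replica has observed
        observed = self.adds.get(element, set())
        self.removes.setdefault(element, set()).update(observed)

    def merge(self, other):
        for e, tags in other.adds.items():
            self.adds.setdefault(e, set()).update(tags)
        for e, tags in other.removes.items():
            self.removes.setdefault(e, set()).update(tags)

    def elements(self):
        return {e for e, tags in self.adds.items()
                if tags - self.removes.get(e, set())}

# The cart scenario from above:
us, eu = ORSet(), ORSet()
us.add("laptop"); us.add("mouse")
eu.merge(us)                  # both replicas see {laptop, mouse}
us.add("keyboard")            # concurrent update via US node
eu.remove("mouse")            # concurrent update via EU node
us.merge(eu); eu.merge(us)    # replicas exchange state
assert us.elements() == eu.elements() == {"laptop", "keyboard"}
```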

Interview note: CRDTs are used by Riak, Redis (CRDTs in Redis Enterprise), Apple (Notes app sync), Figma (collaborative editing). You do not need to implement them, but knowing they exist and what they solve is valuable.


Real-World Tradeoffs

Case Study: Consistency Choices at Scale

| Company | System | Consistency Choice | Why |
|---|---|---|---|
| Amazon | Shopping cart | Eventual (AP) | Cart should always be available; merge conflicts later |
| Amazon | Order placement | Strong (CP) | Cannot charge twice or lose an order |
| Twitter | Tweet timeline | Eventual | Seeing a tweet 2 seconds late is acceptable |
| Twitter | Direct messages | Strong | Losing a message is not acceptable |
| Uber | Ride matching | Strong | Cannot assign same driver to two riders |
| Uber | Trip history | Eventual | Viewing yesterday's trip a second late is fine |
| Netflix | View count | Eventual | Approximate counts are fine for recommendations |
| Stripe | Payment processing | Strong | Financial transactions demand exactness |

The Consistency Spectrum

  STRONG                                                          EVENTUAL
  <=================================================================>
  |              |              |              |              |
  Linearizable   Sequential    Causal         Session       Eventual
  
  Every read     Operations    Causally       Per-session    Will converge
  sees latest    appear in     related ops    guarantees     eventually
  write          some total    are ordered    (read-your-
                 order                        writes)
  
  Bank           Distributed   Social media   Shopping       View counts
  balances       locks         comments       cart           Analytics
                               (replies after 
                                parent)

Practical Guidelines

| Data Type | Consistency Needed | Why |
|---|---|---|
| Financial transactions | Strong | Money must be exact |
| Inventory counts | Strong | Over-selling is costly |
| User authentication | Strong | Security cannot be eventual |
| Social media likes | Eventual | 142 vs 143 likes does not matter |
| Search indexes | Eventual | Slight delay in searchability is fine |
| Analytics/metrics | Eventual | Aggregates smooth out inconsistencies |
| Configuration/feature flags | Strong | Incorrect config causes outages |
| User profiles | Session (read-your-writes) | User should see their own changes |

Key Takeaways

  1. Strong consistency is expensive — It requires coordination between nodes, adding latency and reducing availability. Use it only when correctness demands it.
  2. Eventual consistency is not "broken" — It is a deliberate choice that enables higher availability and performance. Most user-facing data can tolerate brief inconsistency.
  3. The quorum formula (R + W > N) is your tool for tuning consistency — Memorize it for interviews.
  4. 2PC is a coordination protocol, not a microservice pattern — It blocks, does not scale, and should be avoided across service boundaries.
  5. Sagas are the standard for distributed transactions in microservices — Choreography for simple flows, orchestration for complex ones.
  6. CRDTs eliminate conflicts by design — They are the most elegant solution for concurrent updates across replicas, but they only work for specific data types.
  7. Different data in the same system needs different consistency — Payment data needs strong consistency while activity feeds can be eventual. Design per use case, not per system.
  8. Read-your-writes consistency solves 80% of user-facing issues — Users expect to see their own changes immediately, even if other users see them with a delay.

Explain-It Challenge

Scenario: You are the tech lead for a ride-sharing app (like Uber). Your system must handle:

  1. Rider requests a ride — Must not double-book a driver
  2. Payment processing — Must not charge twice
  3. Driver location updates — Sent every 3 seconds from the driver's phone
  4. Trip history — Riders can view past trips
  5. Surge pricing — Prices adjust based on demand

For each feature, specify:

  • What consistency level you would use and why
  • Whether you would use a saga, 2PC, or neither
  • What happens if a network partition occurs during this operation

Back to 9.8 README