Consistency models
In this series (20 parts)
- What is system design and why it matters
- Estimations and back-of-envelope calculations
- Scalability: vertical vs horizontal scaling
- CAP theorem and distributed system tradeoffs
- Consistency models
- Load balancing
- Caching: strategies and patterns
- Content Delivery Networks
- Databases: SQL vs NoSQL and when to use each
- Database replication
- Database sharding and partitioning
- Consistent hashing
- Message queues and event streaming
- API design: REST, GraphQL, gRPC
- Rate limiting and throttling
- Proxies: forward and reverse
- Networking concepts for system design
- Reliability patterns: timeouts, retries, circuit breakers
- Observability: logging, metrics, tracing
- Security in system design
Every time you write a value to a distributed system and then read it back, you are relying on a consistency model. The model defines what the system promises about the relationship between writes and reads across nodes. Pick a model that is too strong and you pay in latency. Pick one that is too weak and your users see stale or contradictory data. This article walks through the major consistency models, explains when each one makes sense, and shows how they connect to the CAP theorem you already know.
What consistency actually means
A single-node database has no consistency dilemma. Every read returns the latest write because there is exactly one copy of the data. The moment you replicate data across two or more nodes, you introduce a delay. Node A accepts a write. Node B still holds the old value until the write propagates. During that window, different clients can see different versions of the same record.
A consistency model is a contract between the system and its clients. It answers one question: after a write completes, which reads are guaranteed to see that write?
Strong consistency
Strong consistency (also called linearizability) is the simplest contract. Every read returns the most recent write, regardless of which node handles the request. The system behaves as if there is a single copy of the data, even though replicas exist behind the scenes.
How does this work in practice? The write must be acknowledged by a majority (or all) of the replicas before the system tells the client “success.” Reads either go to the leader or require a quorum. Google’s Spanner achieves this using synchronized clocks (TrueTime) with commit-wait intervals of around 7ms. ZooKeeper routes all writes through a leader via its ZAB consensus protocol; clients that need a linearizable read call sync() first, because plain reads are served locally and can lag.
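The majority-acknowledgment rule can be sketched in a few lines. This is an illustration of the counting logic only, with replicas modeled as in-memory dicts (a real system layers this on a consensus protocol and RPCs that can fail):

```python
def quorum_write(replicas: list, key: str, value: object) -> bool:
    """Apply the write to every reachable replica; report success
    only if a majority acknowledged it."""
    acks = 0
    for replica in replicas:
        if replica is None:          # simulate an unreachable replica
            continue
        replica[key] = value         # in a real system: an RPC to the replica
        acks += 1
    majority = len(replicas) // 2 + 1
    return acks >= majority

# 3 replicas, one down: 2 acks meet the majority of 2, the write commits.
assert quorum_write([{}, {}, None], "balance", 100) is True
# 3 replicas, two down: 1 ack misses the majority, the client sees a failure.
assert quorum_write([{}, None, None], "balance", 100) is False
```

Note the connection to availability: once fewer than a majority of replicas are reachable, the system refuses writes rather than risk divergence.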
The cost is real. If your replicas sit in us-east-1 and eu-west-1, a round trip takes roughly 80ms. A quorum write touching both regions adds at least that much latency to every operation. For a payment ledger this is acceptable. For a “like” counter on a social feed it is not.
Strong consistency is the right choice when correctness is non-negotiable: bank transfers, inventory counts, distributed locks, leader elections. If two clients must never see conflicting states, you need linearizability.
Eventual consistency
Eventual consistency sits at the opposite end of the spectrum. The system guarantees that if no new writes arrive, all replicas will converge to the same value. It says nothing about when convergence happens. The window could be 5ms or 5 seconds.
Amazon’s Dynamo paper popularized this model. When you add an item to your shopping cart, the write goes to the nearest replica and returns immediately. Other replicas receive the update asynchronously. If you read from a different replica before propagation completes, you might see the old cart.
```mermaid
sequenceDiagram
    participant C1 as Client A
    participant R1 as Replica 1
    participant R2 as Replica 2
    participant C2 as Client B
    C1->>R1: write(x = 42)
    R1-->>C1: ack
    Note over R1,R2: inconsistency window
    C2->>R2: read(x)
    R2-->>C2: x = 0 (stale)
    R1--)R2: async replication
    C2->>R2: read(x)
    R2-->>C2: x = 42 (converged)
```
Client B reads stale data from Replica 2 during the inconsistency window. After async replication completes, both replicas agree on x = 42.
This model enables extremely low latency. Writes return in under 1ms because they touch a single node. Availability stays high because no coordination is required. DNS is a classic example: when you update a DNS record, propagation to all nameservers can take 24 to 48 hours, and the world keeps working.
Eventual consistency works well for systems where temporary staleness is harmless: social media timelines, product view counts, recommendation caches, sensor telemetry. The pattern is always the same: losing a few seconds of freshness does not break the user experience.
Causal consistency
Causal consistency fills the gap between strong and eventual. It guarantees that operations with a cause-and-effect relationship are seen in the correct order by all nodes. Operations that are causally unrelated (concurrent) can appear in any order.
Consider a chat application. Alice posts “Anyone free for lunch?” and then Bob replies “I am.” Causal consistency ensures every user sees Alice’s message before Bob’s reply. Without it, some users might see Bob’s reply floating alone, which makes no sense.
Systems track causality using vector clocks or version vectors. Each operation carries a logical timestamp that encodes which prior operations it depends on. When a replica receives an operation, it delays delivery until all causal predecessors have been applied. COPS (Clusters of Order-Preserving Servers) from the 2011 SOSP paper demonstrated this at scale with cross-datacenter replication under 1ms local latency.
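A minimal vector-clock sketch makes the mechanism concrete. These helpers are illustrative (not the COPS implementation): each node increments its own slot on a local event, merges clocks when it receives a message, and one clock happens-before another when it is less than or equal in every slot:

```python
def increment(clock: dict, node: str) -> dict:
    """Record a local event at `node` by bumping its own slot."""
    clock = dict(clock)
    clock[node] = clock.get(node, 0) + 1
    return clock

def merge(a: dict, b: dict) -> dict:
    """Combine knowledge from two clocks: element-wise maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}

def happens_before(a: dict, b: dict) -> bool:
    """True when event a is a causal predecessor of event b."""
    nodes = a.keys() | b.keys()
    return all(a.get(n, 0) <= b.get(n, 0) for n in nodes) and a != b

alice = increment({}, "alice")             # Alice posts her message
bob = increment(merge({}, alice), "bob")   # Bob replies after seeing it
assert happens_before(alice, bob)          # the reply depends on the post
```

A replica that receives Bob’s reply first can see from the clock that Alice’s message is a missing predecessor, and holds the reply back until it arrives.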
Causal consistency costs less than strong consistency because concurrent writes skip coordination entirely. It costs more than eventual consistency because the system must track and enforce ordering. For collaborative editing tools, social network feeds, and messaging systems, the trade-off is usually worthwhile.
Read-your-writes consistency
Read-your-writes (also called read-my-writes) is a session-level guarantee. After a client performs a write, that same client is guaranteed to see the write on all subsequent reads. Other clients might still see stale data.
This model solves one of the most frustrating user experiences in distributed systems. You update your profile picture, refresh the page, and your old picture stares back at you. With read-your-writes, the system ensures your session always reflects your own changes.
Implementation is straightforward. The system can route all reads for a session to the same replica that handled the write (sticky sessions). Alternatively, the client can send a logical timestamp with each read, and the replica waits until it has caught up to that timestamp before responding.
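The logical-timestamp variant can be sketched as follows. The class and method names here are illustrative; the point is that the client remembers the version of its own last write and a replica refuses to serve a read until it has caught up to that version:

```python
class Replica:
    """A toy replica that tracks the highest write version it has applied."""
    def __init__(self):
        self.version = 0
        self.data = {}

    def apply(self, key, value, version):
        self.data[key] = value
        self.version = version

    def read(self, key, min_version):
        if self.version < min_version:
            # Replica is lagging behind the client's own write:
            # block, retry, or reroute rather than serve stale data.
            raise TimeoutError("replica lagging; retry or reroute")
        return self.data.get(key)

leader, follower = Replica(), Replica()
leader.apply("avatar", "new.png", version=1)
session_version = 1                    # client stores this after its write

try:
    follower.read("avatar", min_version=session_version)
except TimeoutError:
    pass                               # follower has not replicated yet

follower.apply("avatar", "new.png", version=1)   # replication catches up
assert follower.read("avatar", min_version=session_version) == "new.png"
```

Sticky sessions avoid this bookkeeping entirely, at the cost of tying a session to one replica’s availability.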
Read-your-writes is weaker than causal consistency. It only guarantees you see your own writes, not the causal ordering of everyone’s writes. But it is cheap and solves the most visible consistency bug users encounter. Most web applications need at least this guarantee.
Monotonic reads
Monotonic reads guarantee that once a client sees a particular version of the data, it never sees an older version on subsequent reads. Time does not appear to move backward for any individual client.
Without this guarantee, bizarre things happen. You load a dashboard showing 150 orders. You refresh and see 140 orders. You refresh again and see 155. The second read went to a replica that was behind the first. Monotonic reads prevent this by ensuring each client’s reads come from progressively up-to-date snapshots.
The implementation often overlaps with read-your-writes. Sticky sessions naturally provide both guarantees. If sticky sessions are not possible, the client can track the latest version it observed and reject responses from replicas that have not reached that version.
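The client-side check described above fits in a few lines. This is a sketch under the assumption that every response carries a version number; the dashboard numbers match the example earlier:

```python
class MonotonicClient:
    """Discard responses from replicas older than anything already seen."""
    def __init__(self):
        self.highest_seen = 0

    def accept(self, value, version):
        if version < self.highest_seen:
            return None                # stale replica: ignore, retry elsewhere
        self.highest_seen = version
        return value

c = MonotonicClient()
assert c.accept(150, version=7) == 150   # first read: 150 orders at v7
assert c.accept(140, version=5) is None  # lagging replica rejected
assert c.accept(155, version=8) == 155   # newer snapshot accepted
```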
Why weaker models exist
You might wonder why anyone would choose eventual consistency when strong consistency “just works.” The answer comes down to physics and the CAP theorem.
Light travels through fiber at roughly 200,000 km/s. A round trip from New York to London (11,000 km) takes about 55ms. No engineering can eliminate that delay. Strong consistency requires coordination across replicas, which means waiting for those round trips. In a three-datacenter setup spanning the US and Europe, a strongly consistent write needs two round trips to a majority: approximately 110ms best case.
Eventual consistency needs zero round trips beyond the local replica: under 1ms. That is a 100x difference. At 10,000 writes per second, strong consistency adds 1,100 seconds of cumulative wait time every second. Eventual consistency adds 10 seconds.
```mermaid
stateDiagram-v2
    [*] --> Diverged: write arrives at one replica
    Diverged --> Propagating: async replication starts
    Propagating --> Converging: replicas exchange updates
    Converging --> Consistent: all replicas agree
    Consistent --> Diverged: new write arrives
    Consistent --> [*]: no more writes
```
Replica convergence cycle in an eventually consistent system. The system spends most of its time oscillating between Diverged and Consistent states.
Database replication strategies determine how fast replicas move through this cycle. Synchronous replication holds the write until all replicas confirm, giving strong consistency at the cost of availability. Asynchronous replication lets the write return immediately, giving eventual consistency with high availability.
The trade-off extends beyond latency. Stronger models reduce the number of concurrent requests the system can handle because coordination creates contention. Weaker models let replicas operate independently, scaling throughput linearly with the number of nodes.
Choosing the right model
There is no universally correct consistency model. The choice depends on what breaks when clients see stale data.
Financial transactions demand strong consistency. If two ATMs process withdrawals concurrently against the same account, the system must serialize them. A brief period of unavailability is better than overdrawing an account. Use consensus protocols like Raft or Paxos, or a database like Spanner or CockroachDB.
User-facing content like posts, comments, and profiles usually needs read-your-writes at minimum. Users tolerate other people’s updates arriving a few seconds late, but they cannot tolerate their own changes vanishing. Combine sticky sessions with asynchronous replication.
Messaging and collaboration benefit from causal consistency. The ordering of causally related events (a reply after its parent message) matters for coherence. Systems like MongoDB (with causal sessions) and some custom protocols provide this.
Analytics and caching can run on eventual consistency. A dashboard showing page views from 5 seconds ago is fine. A CDN serving a slightly stale product page is fine. The simplicity and performance gains are worth the staleness. See the databases article in this series for how different storage engines handle these guarantees at the query level.
Hybrid approaches are common in production systems. A single application might use strong consistency for the payments table, causal consistency for the messaging service, and eventual consistency for the recommendation engine. Each component picks the weakest model it can tolerate, minimizing coordination overhead where possible.
Tunable consistency
Some systems let you choose consistency per operation rather than per cluster. Cassandra is the canonical example. You configure a replication factor (say RF = 3) and then set consistency levels per query:
- ONE: read or write touches a single replica. Fastest, weakest guarantee.
- QUORUM: read or write touches a majority (2 of 3). Provides strong consistency when both reads and writes use QUORUM, because any two majorities overlap.
- ALL: every replica must respond. Strongest guarantee, lowest availability.
The formula is simple. If W + R > N (where W is the number of replicas that must acknowledge a write, R is the number consulted on a read, and N is the total number of replicas), you get strong consistency for that operation, because the read set and write set must overlap in at least one replica. With RF = 3, QUORUM writes (W = 2) plus QUORUM reads (R = 2) give 2 + 2 = 4 > 3. Strong consistency. A ONE write (W = 1) plus a ONE read (R = 1) gives 1 + 1 = 2 ≤ 3. Eventual consistency.
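The overlap rule reduces to a one-line check (a sketch; the names are illustrative, not Cassandra’s API):

```python
def is_strong(w: int, r: int, n: int) -> bool:
    """True when every read set must overlap every write set."""
    return w + r > n

assert is_strong(w=2, r=2, n=3)        # QUORUM + QUORUM: sets must overlap
assert is_strong(w=3, r=1, n=3)        # ALL writes let even ONE reads be safe
assert not is_strong(w=1, r=1, n=3)    # ONE + ONE: a read can miss the write
```

The second case shows the lever in both directions: paying more on writes (ALL) buys cheaper reads (ONE), and vice versa.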
This flexibility lets you use strong consistency only where it matters, keeping the rest of your queries fast.
Conflict resolution
Weaker consistency models allow concurrent writes to the same key. When replicas converge, they must resolve conflicts. Common strategies include:
Last-writer-wins (LWW) picks the write with the highest timestamp. Simple but lossy. If two users edit the same document concurrently, one edit silently disappears. Cassandra uses LWW by default.
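LWW can be stated in one function, sketched here with (timestamp, value) pairs; note how the concurrent edit with the smaller timestamp simply vanishes:

```python
def lww_merge(a: tuple, b: tuple) -> tuple:
    """Resolve a conflict by keeping the (timestamp, value) pair
    with the higher timestamp; ties favor the first argument."""
    return a if a[0] >= b[0] else b

edit_alice = (1700000000.120, "Alice's revision")
edit_bob = (1700000000.118, "Bob's revision")    # 2ms earlier: silently lost
assert lww_merge(edit_alice, edit_bob) == edit_alice
```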
Merge functions combine concurrent writes deterministically. CRDTs (Conflict-free Replicated Data Types) are the formal framework here. A G-Counter, for example, lets each replica increment its own slot, and the merged value is the sum of all slots. No writes are lost.
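A G-Counter is small enough to sketch in full. This follows the structure described above rather than any particular library: each replica increments only its own slot, and merging takes the element-wise maximum, so concurrent increments are never lost:

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merge by max."""
    def __init__(self, node: str):
        self.node = node
        self.slots = {}

    def increment(self, amount: int = 1):
        self.slots[self.node] = self.slots.get(self.node, 0) + amount

    def merge(self, other: "GCounter"):
        for node, count in other.slots.items():
            self.slots[node] = max(self.slots.get(node, 0), count)

    def value(self) -> int:
        return sum(self.slots.values())

a, b = GCounter("a"), GCounter("b")
a.increment(); a.increment()   # two increments recorded at replica a
b.increment()                  # one increment, concurrently, at replica b
a.merge(b); b.merge(a)         # replicas exchange state in either order
assert a.value() == b.value() == 3
```

Because merge is commutative, associative, and idempotent, replicas can exchange state in any order, any number of times, and still converge, which is exactly the property eventual consistency needs.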
Application-level resolution pushes the conflict to the client. Amazon’s Dynamo returns all conflicting versions (siblings) and lets the application decide. The shopping cart merges items from both versions, which is why you sometimes see duplicates in your cart.
The choice of conflict resolution strategy is just as important as the consistency model itself. Strong consistency avoids conflicts entirely. Eventual consistency must handle them gracefully.
What comes next
Consistency models dictate what guarantees your replicas provide. But before any request reaches a replica, something must decide which server handles it. That is the job of a load balancer. In the next article on load balancing, we will look at algorithms like round-robin, least connections, and consistent hashing, and how they interact with session affinity and the consistency guarantees we covered here.