Table of contents
Open Table of contents
Introduction
When systems move from one server to many servers, failures become normal. CAP Theorem gives you the core decision model for those failures, and consistency models help you apply it feature-by-feature.
Lesson 3: CAP Theorem
The Core Idea
CAP stands for:
- Consistency (C): Every node returns the same latest value.
- Availability (A): Every request gets a response.
- Partition Tolerance (P): System continues despite network splits between nodes.
The practical theorem:
When a network partition happens, you must choose between Consistency and Availability.
Since partitions are unavoidable in distributed systems, the real choice is:
- CP: Keep data correct, allow temporary unavailability.
- AP: Keep system responsive, allow temporary stale data.
Banking Example
Two replicas: Mumbai and Delhi. Mumbai accepts a transfer, then network between regions fails.
Delhi still has old balance. Now you choose:
- Show old balance (available but possibly wrong) -> AP behavior
- Show error/unavailable until sync completes (correct but unavailable) -> CP behavior
This is a business decision, not just a technical one.
Where Each Model Fits
CP (correct data or no data):
- payments, bookings, account balance, inventory
- wrong value can cause financial or logical damage
AP (always respond, slight staleness accepted):
- social feed counters, likes, recommendations, map location
- brief staleness is acceptable for user experience
Eventual Consistency
AP systems often use eventual consistency:
Data may be stale now, but replicas converge over time.
This is why many large products stay responsive during failures without requiring strict synchronization for every read.
Decision Framework
Ask one question per feature:
What is the cost of stale or wrong data?
- High cost -> CP
- Low cost -> AP + eventual consistency
Exercise: Food Delivery Features
Choose CP or AP for each:
- Restaurant menu and prices
- Order placement and payment
- Delivery person live location
- Likes on restaurant review
Reference answer:
- Menu and prices -> CP (price mismatch causes disputes/loss)
- Order + payment -> CP (correctness is mandatory)
- Live location -> AP (slight delay is acceptable)
- Likes count -> AP (eventual consistency is fine)
Lesson 4: Availability vs Consistency (Deep Dive)
CAP gives the trade-off. This lesson makes it measurable and implementable.
Availability Is Measured in “Nines”
| Availability | Downtime/year | Downtime/month |
|---|---|---|
| 99% | 3.65 days | 7.3 hours |
| 99.9% | 8.7 hours | 43 minutes |
| 99.99% | 52 minutes | 4.3 minutes |
| 99.999% | 5 minutes | 26 seconds |
One extra nine looks small but dramatically changes architecture cost and complexity.
Consistency Is a Spectrum
Not all systems need strong consistency everywhere.
-
Strong consistency
- Every read gets latest write
- Use for money, inventory, booking
-
Eventual consistency
- Replicas converge later
- Use for feeds, counters, non-critical read paths
-
Read-your-own-writes
- A user sees their own update immediately
- Great for comments, profile edits, content creation UX
-
Causal consistency
- Related events preserve logical order
- Useful in collaboration/chat/threaded interactions
Consistency Zones in One Product
A single product usually uses multiple models:
- Payment flow -> CP + strong consistency
- Live map updates -> AP + eventual consistency
- User-authored content view -> read-your-own-writes
Do not choose one global model for all features.
Exercise: Google Docs Style Editor
Scenario: multiple users edit same doc in near real time.
Questions:
- Latency-sensitive or throughput-sensitive?
- Main bottleneck at scale?
- CP or AP?
- Which consistency model for document updates?
Reference answer (practical):
- Both latency and throughput matter (typing UX + many concurrent edits)
- Bottlenecks: collaboration servers and write coordination/storage path
- Prefer AP for editing continuity
- Use mix of read-your-own-writes + causal ordering + eventual convergence
Key Takeaways
- CAP is activated during network partition; then C vs A is a forced choice.
- Partition tolerance is non-negotiable in real distributed systems.
- Pick CP/AP per feature based on business cost of wrong data.
- Availability targets (“nines”) must be explicit before architecture decisions.
- Consistency is a spectrum; strong consistency everywhere is often unnecessary and expensive.
Part of the system design fundamentals series. Next: load balancers, caching layers, databases, queues, and CDN placement decisions.