Why Cassandra Over MongoDB, and How Databases Scale

system-design databases cassandra mongodb

Why Cassandra Kept Winning

Looking across all the system designs, Cassandra appeared repeatedly. The pattern behind every choice:

System               Access pattern                        Why Cassandra
URL Shortener        slug → long_url                       Simple key lookup, 115k reads/sec
Notification System  All notifications for user X by time  Time series, 2B writes/day
Feed System          Posts by user X by time               Time series, write heavy
Ride Sharing         Location pings for ride X             Extreme write throughput, 333k/sec

Every time Cassandra won, three things were true:

  1. Access pattern was known upfront — “give me records for this key”
  2. Scale was massive — billions of records, thousands of writes/sec
  3. Data was time series or append-only — mostly inserts, rarely updates

When MongoDB Wins Instead

Use MongoDB when the inverse of those three conditions holds:

  1. Query patterns aren't known upfront and you need flexible, ad-hoc queries
  2. Documents are updated in place rather than appended
  3. Write volume is moderate (thousands per day, not millions)

The Decision Rule

1. Do I know exactly how I'll query this data?
   Yes → Cassandra (optimise for that query)
   No  → MongoDB (flexible queries)

2. What's my write volume?
   Millions+ per day → Cassandra
   Thousands per day → either works
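The two questions above can be sketched as a toy routing function (the function name and the thresholds are illustrative, not from any real library):

```python
def choose_database(query_pattern_known: bool, writes_per_day: int) -> str:
    """Toy encoding of the two-question decision rule above."""
    if not query_pattern_known:
        return "mongodb"             # flexible, ad-hoc queries
    if writes_per_day >= 1_000_000:  # millions+ per day
        return "cassandra"           # optimise for the known query
    return "either"                  # thousands per day: either works
```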

Replication — Surviving Failures

Replication = keeping multiple copies of data on different machines.

Master-Slave

Primary (Master) → handles all writes
         ↓ replicates to
Replica 1 (Slave) → reads only
Replica 2 (Slave) → reads only

Good for read-heavy systems. Primary dies → replica promoted (30-60 sec downtime).
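A minimal sketch of master-slave routing, with made-up node names: every write hits the primary, reads round-robin across the replicas.

```python
import itertools

class MasterSlaveRouter:
    """Sketch of master-slave routing: one writer, many readers."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # round-robin reads

    def node_for_write(self):
        return self.primary          # all writes go to the master

    def node_for_read(self):
        return next(self._replicas)  # spread reads across slaves
```

This is why the pattern suits read-heavy systems: adding replicas scales reads, while write capacity stays fixed at one node.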

Multi-Master (Cassandra’s model)

Primary 1 ←sync→ Primary 2 ←sync→ Primary 3
All three accept reads and writes

No single point of failure. Any node dying doesn’t affect writes. Trade-off: conflict resolution needed (last write wins) — this is the AP choice from CAP theorem.
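Last write wins fits in a few lines: each replica reports a (timestamp, value) pair and the newest one is kept. This is a simplification of what Cassandra does, which tracks write timestamps per column.

```python
def last_write_wins(versions):
    """Resolve divergent replica values: keep the newest timestamp.

    versions: list of (timestamp, value) pairs, one per replica.
    """
    return max(versions, key=lambda tv: tv[0])[1]
```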

Replication Factor

Factor 1 → no redundancy (never do this)
Factor 2 → one failure tolerated
Factor 3 → industry standard — two simultaneous failures tolerated

In Cassandra with factor 3: a write is confirmed once 2 of 3 nodes acknowledge it (a quorum).
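The "2 of 3" figure is just a majority quorum, which generalises to any replication factor:

```python
def quorum(replication_factor: int) -> int:
    """Acknowledgements needed for a majority-quorum write."""
    return replication_factor // 2 + 1
```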


Sharding — When Data Is Too Big for One Machine

Even with replication, one primary has limits. Sharding = splitting data across multiple database instances. Each instance (shard) holds a subset.

Without sharding: one DB holds all 500M users

With sharding:
Shard 1 → users 1 to 125M
Shard 2 → users 125M to 250M
Shard 3 → users 250M to 375M
Shard 4 → users 375M to 500M

Choosing a Shard Key

Range-based — by user ID range. Problem: new users always get high IDs → all new writes hit last shard → hot shard.

Hash-based

shard_number = hash(userId) % number_of_shards

userId 1001 → hash → Shard 3
userId 1002 → hash → Shard 1
Distribution is uniform. No hot shards.
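A minimal hash-sharding sketch. Python's built-in hash() is randomised per process, so routing needs a stable hash (md5 here; Cassandra itself uses Murmur3). The shard numbers it produces won't match the illustrative ones above.

```python
import hashlib

NUM_SHARDS = 4

def shard_for(user_id: int) -> int:
    """Route a user to a shard via a stable hash of the key."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS
```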

Geographic — India users → India shard, US users → US shard. Good for latency and GDPR. Problem: uneven load if one region is much larger.

The Cross-Shard Query Problem

Simple query (works great):
"Give me user 1001" → hash(1001) → Shard 3 → done ✅

Complex query (breaks):
"All users who signed up last month with 10+ orders"
→ Must query every shard → merge results
→ Called scatter-gather — expensive ❌
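The contrast can be sketched with in-memory dicts standing in for shards (the shard-key function and record layout are made up for illustration):

```python
def shard_id(user_id, num_shards):
    return user_id % num_shards   # toy shard-key function

def get_user(shards, user_id):
    """Targeted query: the shard key routes us to exactly one shard."""
    return shards[shard_id(user_id, len(shards))].get(user_id)

def scatter_gather(shards, predicate):
    """Scatter-gather: the filter isn't on the shard key, so every
    shard must be queried and the partial results merged."""
    results = []
    for shard in shards:                 # fan out to ALL shards
        results.extend(u for u in shard.values() if predicate(u))
    return results
```

get_user touches one node no matter how many shards exist; scatter_gather touches all of them, so its cost grows with the cluster.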

Sharding forces you to design queries around your shard key.


How the Systems Actually Used These

System                     Shard key          Replication
URL Shortener              slug (hash)        Factor 3
Notification System        recipient_id       Factor 3
Feed System                user_id            Factor 3
Ride Sharing (PostgreSQL)  city (geographic)  Master-slave

Replication + Sharding Together

3 shards × replication factor 3 = 9 total nodes

Shard 1 → Primary 1 + Replica 1a + Replica 1b
Shard 2 → Primary 2 + Replica 2a + Replica 2b
Shard 3 → Primary 3 + Replica 3a + Replica 3b

Sharding handles scale. Replication handles availability. Start with replication first — it’s simpler. Add sharding only when a single server genuinely can’t handle the load.
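The 3 × 3 = 9 layout above can be generated mechanically (node names are illustrative):

```python
def cluster_layout(num_shards: int, replication_factor: int) -> dict:
    """Map each shard to its replica set: one primary plus copies."""
    return {
        f"shard-{s}": [f"shard-{s}-node-{r}"
                       for r in range(1, replication_factor + 1)]
        for s in range(1, num_shards + 1)
    }
```

cluster_layout(3, 3) yields three replica sets of three nodes each: nine nodes in total.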


The Interview Line

“I’d use Cassandra with replication factor 3 and partition key as userId. This gives fault tolerance through replication and horizontal scale through consistent hash partitioning.”