Why Cassandra Over MongoDB, and How Databases Scale

system-design databases cassandra mongodb

Why Cassandra Kept Winning

Looking across all the system designs, Cassandra appeared repeatedly. The pattern behind every choice:

System               Access pattern                        Why Cassandra
URL Shortener        slug → long_url                       Simple key lookup, 115k reads/sec
Notification System  All notifications for user X by time  Time series, 2B writes/day
Feed System          Posts by user X by time               Time series, write heavy
Ride Sharing         Location pings for ride X             Extreme write throughput, 333k/sec

Every time Cassandra won, three things were true:

  1. Access pattern was known upfront — “give me records for this key”
  2. Scale was massive — billions of records, thousands of writes/sec
  3. Data was time series or append-only — mostly inserts, rarely updates

When MongoDB Wins Instead

Use MongoDB when the inverse of those three conditions holds:

  1. Query patterns aren't known upfront and you need flexible, ad-hoc queries
  2. Documents are updated in place rather than appended
  3. Write volume is moderate (thousands per day, not millions)

The Decision Rule

1. Do I know exactly how I'll query this data?
   Yes → Cassandra (optimise for that query)
   No  → MongoDB (flexible queries)

2. What's my write volume?
   Millions+ per day → Cassandra
   Thousands per day → either works
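The two questions above can be sketched as a toy routing function (the function name and the thresholds are illustrative, not from any real library):

```python
def choose_database(query_pattern_known: bool, writes_per_day: int) -> str:
    """Toy encoding of the two-question decision rule above."""
    if not query_pattern_known:
        return "mongodb"             # flexible, ad-hoc queries
    if writes_per_day >= 1_000_000:  # millions+ per day
        return "cassandra"           # optimise for the known query
    return "either"                  # thousands per day: either works
```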

Replication — Surviving Failures

Replication = keeping multiple copies of data on different machines.

Master-Slave

Primary (Master) → handles all writes
         ↓ replicates to
Replica 1 (Slave) → reads only
Replica 2 (Slave) → reads only

Good for read-heavy systems. Primary dies → replica promoted (30-60 sec downtime).
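A minimal sketch of master-slave routing, with made-up node names: every write hits the primary, reads round-robin across the replicas.

```python
import itertools

class MasterSlaveRouter:
    """Sketch of master-slave routing: one writer, many readers."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # round-robin reads

    def node_for_write(self):
        return self.primary          # all writes go to the master

    def node_for_read(self):
        return next(self._replicas)  # spread reads across slaves
```

This is why the pattern suits read-heavy systems: adding replicas scales reads, while write capacity stays fixed at one node.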

Multi-Master (Cassandra’s model)

Primary 1 ←sync→ Primary 2 ←sync→ Primary 3
All three accept reads and writes

No single point of failure. Any node dying doesn’t affect writes. Trade-off: conflict resolution needed (last write wins) — this is the AP choice from CAP theorem.
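Last write wins fits in a few lines: each replica reports a (timestamp, value) pair and the newest one is kept. This is a simplification of what Cassandra does, which tracks write timestamps per column.

```python
def last_write_wins(versions):
    """Resolve divergent replica values: keep the newest timestamp.

    versions: list of (timestamp, value) pairs, one per replica.
    """
    return max(versions, key=lambda tv: tv[0])[1]
```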

Replication Factor

Factor 1 → no redundancy (never do this)
Factor 2 → one failure tolerated
Factor 3 → industry standard — two simultaneous failures tolerated

In Cassandra with factor 3: a write is confirmed once 2 of 3 nodes acknowledge it (a quorum).
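The "2 of 3" figure is just a majority quorum, which generalises to any replication factor:

```python
def quorum(replication_factor: int) -> int:
    """Acknowledgements needed for a majority-quorum write."""
    return replication_factor // 2 + 1
```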


Sharding — When Data Is Too Big for One Machine

Even with replication, one primary has limits. Sharding = splitting data across multiple database instances. Each instance (shard) holds a subset.

Without sharding: one DB holds all 500M users

With sharding:
Shard 1 → users 1 to 125M
Shard 2 → users 125M to 250M
Shard 3 → users 250M to 375M
Shard 4 → users 375M to 500M

Choosing a Shard Key

Range-based — by user ID range. Problem: new users always get high IDs → all new writes hit last shard → hot shard.

Hash-based

shard_number = hash(userId) % number_of_shards

userId 1001 → hash → Shard 3
userId 1002 → hash → Shard 1
Distribution is uniform. No hot shards.
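A minimal hash-sharding sketch. Python's built-in hash() is randomised per process, so routing needs a stable hash (md5 here; Cassandra itself uses Murmur3). The shard numbers it produces won't match the illustrative ones above.

```python
import hashlib

NUM_SHARDS = 4

def shard_for(user_id: int) -> int:
    """Route a user to a shard via a stable hash of the key."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS
```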

Geographic — India users → India shard, US users → US shard. Good for latency and GDPR. Problem: uneven load if one region is much larger.

The Cross-Shard Query Problem

Simple query (works great):
"Give me user 1001" → hash(1001) → Shard 3 → done ✅

Complex query (breaks):
"All users who signed up last month with 10+ orders"
→ Must query every shard → merge results
→ Called scatter-gather — expensive ❌
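The contrast can be sketched with in-memory dicts standing in for shards (the shard-key function and record layout are made up for illustration):

```python
def shard_id(user_id, num_shards):
    return user_id % num_shards   # toy shard-key function

def get_user(shards, user_id):
    """Targeted query: the shard key routes us to exactly one shard."""
    return shards[shard_id(user_id, len(shards))].get(user_id)

def scatter_gather(shards, predicate):
    """Scatter-gather: the filter isn't on the shard key, so every
    shard must be queried and the partial results merged."""
    results = []
    for shard in shards:                 # fan out to ALL shards
        results.extend(u for u in shard.values() if predicate(u))
    return results
```

get_user touches one node no matter how many shards exist; scatter_gather touches all of them, so its cost grows with the cluster.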

Sharding forces you to design queries around your shard key.


How the Systems Actually Used These

System                     Shard key          Replication
URL Shortener              slug (hash)        Factor 3
Notification System        recipient_id       Factor 3
Feed System                user_id            Factor 3
Ride Sharing (PostgreSQL)  city (geographic)  Master-slave

Replication + Sharding Together

3 shards × replication factor 3 = 9 total nodes

Shard 1 → Primary 1 + Replica 1a + Replica 1b
Shard 2 → Primary 2 + Replica 2a + Replica 2b
Shard 3 → Primary 3 + Replica 3a + Replica 3b

Sharding handles scale. Replication handles availability. Start with replication first — it’s simpler. Add sharding only when a single server genuinely can’t handle the load.
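The 3 × 3 = 9 layout above can be generated mechanically (node names are illustrative):

```python
def cluster_layout(num_shards: int, replication_factor: int) -> dict:
    """Map each shard to its replica set: one primary plus copies."""
    return {
        f"shard-{s}": [f"shard-{s}-node-{r}"
                       for r in range(1, replication_factor + 1)]
        for s in range(1, num_shards + 1)
    }
```

cluster_layout(3, 3) yields three replica sets of three nodes each: nine nodes in total.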


The Interview Line

“I’d use Cassandra with replication factor 3 and partition key as userId. This gives fault tolerance through replication and horizontal scale through consistent hash partitioning.”