Table of contents
Open Table of contents
Introduction
SQL vs NoSQL is not about preference, familiarity, or which is “better.” It is purely about your access patterns and data structure. Once you see it this way, the choice becomes obvious every time.
Before choosing a database, ask three questions:
- What does my data look like? → Structured and relational, or flexible and varied?
- How will I access it? → Simple lookups, complex joins, or massive scale reads?
- What guarantees do I need? → Strong consistency or eventual consistency?
Your answers determine the database. Not habit.
SQL Databases
Examples: PostgreSQL, MySQL, SQLite
What Makes SQL Special
Data lives in tables with enforced relationships. One query can join multiple tables.
Users Table Orders Table
----------- ------------
id | name | email id | user_id | product | amount
1 | Raj | r@r.com 1 | 1 | iPhone | 79999
2 | Priya| p@p.com 2 | 1 | AirPods | 19999
3 | 2 | MacBook | 129999
“Show all orders by users who signed up in the last 30 days and spent more than ₹50,000” — one SQL query across both tables.
ACID Guarantees
This is why financial systems use SQL.
- Atomicity: A transaction fully completes or fully fails — never half done. Transfer ₹10,000: both debit and credit happen, or neither does.
- Consistency: Data always moves between valid states. Balance cannot go below zero.
- Isolation: Two simultaneous transactions don’t interfere. Two people booking the last seat — only one succeeds.
- Durability: Once committed, data survives crashes. “Payment successful” is permanent even if server dies 1ms later.
Choose SQL When
- Data has clear relationships (users, orders, products, payments)
- You need complex queries and reporting
- Consistency is non-negotiable (banking, bookings, inventory)
- Data structure is stable and well-defined
NoSQL Databases
NoSQL is not one thing — it’s a family of databases each designed for a specific problem SQL handles poorly.
Type 1: Document Stores
Examples: MongoDB, CouchDB
Data stored as flexible JSON-like documents. Each document can have different fields.
{
"id": "u1",
"name": "Raj",
"address": { "city": "Mumbai", "pincode": "400001" },
"preferences": ["cricket", "movies"],
"social": { "twitter": "@raj" }
}
In SQL, this would need 4 tables and complex joins. Here it’s one document, one read.
Use when: Data is hierarchical, nested, or varies between records. User profiles, restaurant menus, content management.
MongoDB can do joins via $lookup — but it’s not what the engine is optimised for. If your data
constantly needs joins to be useful, you’re working against the tool. SQL JOINs are native and
index-optimised at scale; MongoDB $lookup gets slower as collections grow.
Type 2: Key-Value Stores
Examples: Redis, DynamoDB
Simplest structure. A key maps to a value. Extremely fast.
"session:user123" → { userId: 123, role: "admin", expiry: ... }
"product:iphone15" → { price: 79999, stock: 234 }
"otp:9876543210" → "459821"
No complex queries. No relationships. Just set and get.
Use when: Caching, sessions, OTPs, rate limiting, real-time counters. Anything where you know exactly what key to look up.
Type 3: Column Family Stores
Examples: Cassandra, HBase
Data stored by column rather than row. Designed for massive write throughput across distributed nodes. Reading one column across millions of rows is extremely fast.
Use when: Time series data, event logging, analytics, IoT sensor data. Anything write-heavy at massive scale.
Type 4: Graph Databases
Examples: Neo4j, Amazon Neptune
Data stored as nodes and relationships.
(Raj) -[FRIENDS_WITH]→ (Priya)
(Raj) -[FOLLOWS]→ (Virat Kohli)
(Priya) -[LIKES]→ (Post: "Sunset photo")
“Friends of friends who like similar posts” is trivial here. In SQL it would require many complex joins and degrade badly at scale.
Use when: Social networks, recommendation engines, fraud detection, knowledge graphs.
The Access Pattern Decision Framework
Is data relational with complex queries? → SQL
Need ACID transactions? → SQL
Flexible / varying structure per record? → Document store (MongoDB)
Extreme speed with simple lookups? → Key-Value (Redis, DynamoDB)
Time series or event data at massive scale? → Column family (Cassandra)
Highly connected data with traversal queries? → Graph (Neo4j)
Polyglot Persistence — The Real World Pattern
Production systems almost never use just one database. They use the right database for each type of data.
A system like Swiggy likely uses:
PostgreSQL → Orders, payments, user accounts
(ACID, relationships, complex queries)
MongoDB → Restaurant menus, food item details
(flexible structure, varies per restaurant)
Redis → Sessions, caching, OTPs, rate limiting
(speed is the only requirement)
Cassandra → Delivery tracking events, analytics logs
(massive write throughput, time series)
Each database doing what it’s best at. This is called polyglot persistence.
Exercise: Ride Sharing App
You are designing a ride-sharing app like Ola. Choose the right database for each:
- User accounts, payment methods, and ride history
- Driver real-time location — updating every 3 seconds
- Route graph of roads and connections between locations
- Active session tokens for logged-in users
- Surge pricing events and ride request logs for analytics
Reference answer:
User accounts + payment methods
→ PostgreSQL (SQL)
ACID, relationships, complex queries
Ride history
→ PostgreSQL (SQL) — same database
Relational to users and payments
Needs joins and transaction guarantees
Driver real-time location
→ Redis with Geospatial support
Updates every 3 seconds — speed critical
redis.geoadd / redis.georadius for nearby driver queries
Route graph
→ Neo4j (Graph database)
Roads and intersections are nodes and edges
Shortest path and alternate routes are native operations
Active session tokens
→ Redis (Key-Value)
Fast lookup, automatic TTL expiry
Surge pricing events + ride logs
→ Cassandra (Column Family)
Time series, massive write throughput
Analytical reads across time ranges
Connection to Previous Lessons
- From CAP Theorem: SQL → CP. Cassandra → AP. MongoDB → configurable.
- From Caching: Redis blurs the line between cache and database. For flash sale prices, Redis is the primary store. For product details, it’s the cache layer over PostgreSQL. The label matters less than understanding what guarantee the tool gives you.
Key Takeaways
- SQL vs NoSQL is an access pattern decision, not a preference.
- SQL is built for relationships, complex queries, and ACID guarantees.
- NoSQL is a family: document, key-value, column family, graph — each solves a specific problem.
- Real systems use multiple databases — match each data type to the right tool.
- MongoDB can do joins, but if your data constantly needs them, use SQL instead.
Part of the system design series. Next: Message Queues — how services communicate without depending on each other.