System Design Databases: SQL vs NoSQL

Published: at 08:30 AM
(6 min read)

Table of contents

Open Table of contents

Introduction

SQL vs NoSQL is not about preference, familiarity, or which is “better.” It is purely about your access patterns and data structure. Once you see it this way, the choice becomes obvious every time.

Before choosing a database, ask three questions:

  1. What does my data look like? → Structured and relational, or flexible and varied?
  2. How will I access it? → Simple lookups, complex joins, or massive scale reads?
  3. What guarantees do I need? → Strong consistency or eventual consistency?

Your answers determine the database. Not habit.

SQL Databases

Examples: PostgreSQL, MySQL, SQLite

What Makes SQL Special

Data lives in tables with enforced relationships. One query can join multiple tables.

Users Table            Orders Table
-----------            ------------
id | name | email      id | user_id | product | amount
1  | Raj  | r@r.com    1  | 1       | iPhone  | 79999
2  | Priya| p@p.com    2  | 1       | AirPods | 19999
                       3  | 2       | MacBook | 129999

“Show all orders by users who signed up in the last 30 days and spent more than ₹50,000” — one SQL query across both tables.

ACID Guarantees

This is why financial systems use SQL.

Choose SQL When

NoSQL Databases

NoSQL is not one thing — it’s a family of databases each designed for a specific problem SQL handles poorly.

Type 1: Document Stores

Examples: MongoDB, CouchDB

Data stored as flexible JSON-like documents. Each document can have different fields.

{
  "id": "u1",
  "name": "Raj",
  "address": { "city": "Mumbai", "pincode": "400001" },
  "preferences": ["cricket", "movies"],
  "social": { "twitter": "@raj" }
}

In SQL, this would need 4 tables and complex joins. Here it’s one document, one read.

Use when: Data is hierarchical, nested, or varies between records. User profiles, restaurant menus, content management.

MongoDB can do joins via $lookup — but it’s not what the engine is optimised for. If your data constantly needs joins to be useful, you’re working against the tool. SQL JOINs are native and index-optimised at scale; MongoDB $lookup gets slower as collections grow.

Type 2: Key-Value Stores

Examples: Redis, DynamoDB

Simplest structure. A key maps to a value. Extremely fast.

"session:user123"   → { userId: 123, role: "admin", expiry: ... }
"product:iphone15"  → { price: 79999, stock: 234 }
"otp:9876543210"    → "459821"

No complex queries. No relationships. Just set and get.

Use when: Caching, sessions, OTPs, rate limiting, real-time counters. Anything where you know exactly what key to look up.

Type 3: Column Family Stores

Examples: Cassandra, HBase

Data stored by column rather than row. Designed for massive write throughput across distributed nodes. Reading one column across millions of rows is extremely fast.

Use when: Time series data, event logging, analytics, IoT sensor data. Anything write-heavy at massive scale.

Type 4: Graph Databases

Examples: Neo4j, Amazon Neptune

Data stored as nodes and relationships.

(Raj) -[FRIENDS_WITH]→ (Priya)
(Raj) -[FOLLOWS]→ (Virat Kohli)
(Priya) -[LIKES]→ (Post: "Sunset photo")

“Friends of friends who like similar posts” is trivial here. In SQL it would require many complex joins and degrade badly at scale.

Use when: Social networks, recommendation engines, fraud detection, knowledge graphs.

The Access Pattern Decision Framework

Is data relational with complex queries?      → SQL
Need ACID transactions?                       → SQL
Flexible / varying structure per record?      → Document store (MongoDB)
Extreme speed with simple lookups?            → Key-Value (Redis, DynamoDB)
Time series or event data at massive scale?   → Column family (Cassandra)
Highly connected data with traversal queries? → Graph (Neo4j)

Polyglot Persistence — The Real World Pattern

Production systems almost never use just one database. They use the right database for each type of data.

A system like Swiggy likely uses:

PostgreSQL  → Orders, payments, user accounts
             (ACID, relationships, complex queries)

MongoDB     → Restaurant menus, food item details
             (flexible structure, varies per restaurant)

Redis       → Sessions, caching, OTPs, rate limiting
             (speed is the only requirement)

Cassandra   → Delivery tracking events, analytics logs
             (massive write throughput, time series)

Each database doing what it’s best at. This is called polyglot persistence.

Exercise: Ride Sharing App

You are designing a ride-sharing app like Ola. Choose the right database for each:

  1. User accounts, payment methods, and ride history
  2. Driver real-time location — updating every 3 seconds
  3. Route graph of roads and connections between locations
  4. Active session tokens for logged-in users
  5. Surge pricing events and ride request logs for analytics

Reference answer:

User accounts + payment methods
→ PostgreSQL (SQL)
   ACID, relationships, complex queries

Ride history
→ PostgreSQL (SQL) — same database
   Relational to users and payments
   Needs joins and transaction guarantees

Driver real-time location
→ Redis with Geospatial support
   Updates every 3 seconds — speed critical
   redis.geoadd / redis.georadius for nearby driver queries

Route graph
→ Neo4j (Graph database)
   Roads and intersections are nodes and edges
   Shortest path and alternate routes are native operations

Active session tokens
→ Redis (Key-Value)
   Fast lookup, automatic TTL expiry

Surge pricing events + ride logs
→ Cassandra (Column Family)
   Time series, massive write throughput
   Analytical reads across time ranges

Connection to Previous Lessons

Key Takeaways

  1. SQL vs NoSQL is an access pattern decision, not a preference.
  2. SQL is built for relationships, complex queries, and ACID guarantees.
  3. NoSQL is a family: document, key-value, column family, graph — each solves a specific problem.
  4. Real systems use multiple databases — match each data type to the right tool.
  5. MongoDB can do joins, but if your data constantly needs them, use SQL instead.

Part of the system design series. Next: Message Queues — how services communicate without depending on each other.