System Design: Ride Sharing System (Uber/Ola)

Published: at 10:30 AM


Introduction

Ride sharing introduces something completely new — location as a first-class data type. Every design decision revolves around geography and real-time movement. This is also where distributed locking becomes critical: two riders cannot be matched to the same driver at the same time.


Step 1: Requirements

Functional:
→ Rider requests a ride; a nearby driver is matched within seconds
→ Driver accepts or rejects offers
→ Rider tracks the driver in real time on a map
→ Surge pricing, upfront fares, in-app payment
→ Ratings after the ride

Non-functional:
→ Matching latency of a few seconds at most
→ Very high write throughput for driver location updates
→ High availability for ride requests; strong consistency for matching and payments


Step 2: Scale Estimation

10M rides/day, peak ~500 rides/second

Active drivers sending location every 3 seconds:
1M active drivers / 3s = ~333,000 location writes/second

This is the hardest scaling problem in this system.

Step 3: Geohashing — The Key Concept

The world is divided into a grid. Each cell gets a string identifier called a geohash.

Precision level 5 → ~4.9km × 4.9km cells
Precision level 6 → ~1.2km × 0.6km cells ← useful for matching
Precision level 7 → ~150m × 150m cells

The critical property:

Locations that are geographically close share a common geohash prefix.

Driver 1 at Andheri West: geohash = "te7ud3"
Driver 2 at Andheri West: geohash = "te7ud4"
Driver 3 at Bandra:       geohash = "te7u8k"

Drivers 1+2 share prefix "te7ud" → they are close
Driver 3 shares prefix "te7u" → nearby area but further

Finding nearby drivers becomes a string prefix search instead of a geometric calculation.
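The prefix property is easy to see in code. Below is a minimal pure-Python encoder following the standard geohash algorithm (interleave longitude/latitude bits, emit base32 characters); the coordinates in the usage note are approximate Mumbai locations, used only for illustration:

```python
# Minimal geohash encoder: alternate longitude/latitude bisection,
# emit one base32 character per 5 bits.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat, lon, precision=6):
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, ch, even = [], 0, 0, True
    while len(chars) < precision:
        rng, val = (lon_range, lon) if even else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if val > mid:
            ch = (ch << 1) | 1   # point is in the upper half of the range
            rng[0] = mid
        else:
            ch = ch << 1         # point is in the lower half
            rng[1] = mid
        even = not even
        bits += 1
        if bits == 5:            # 5 bits = one base32 character
            chars.append(BASE32[ch])
            bits, ch = 0, 0
    return "".join(chars)
```

For example, `geohash(19.0760, 72.8777, 6)` (central Mumbai) and `geohash(19.1197, 72.8468, 6)` (a few km north) both begin with "te7", so a string prefix comparison alone tells you the two points are within a few kilometres.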

Driver Location Storage in Redis

Every 3 seconds, driver app sends:
GEOADD "drivers:active" longitude latitude driverId

Finding drivers near a rider:
GEORADIUS "drivers:active" riderLong riderLat 2 "km"
→ Returns all driver IDs within 2km instantly

Redis geospatial commands use geohashing internally — members are kept in a sorted set keyed by geohash score, giving logarithmic-time proximity search across millions of drivers. (Newer Redis versions, 6.2+, supersede GEORADIUS with GEOSEARCH; the semantics here are the same.)
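Conceptually the query is just "all members within distance d". A pure-Python stand-in for what GEORADIUS computes, using the haversine great-circle distance (real Redis additionally prunes candidates by geohash cell before measuring distances):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two (lat, lon) points in kilometres.
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def drivers_within(drivers, rider_lat, rider_lon, radius_km):
    # drivers: {driver_id: (lat, lon)} — in-memory stand-in for the
    # Redis "drivers:active" geo set.
    return [
        d for d, (lat, lon) in drivers.items()
        if haversine_km(rider_lat, rider_lon, lat, lon) <= radius_km
    ]
```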


Step 4: Matching System

Scoring Each Nearby Driver

Score = f(
  distance,          → closer is better
  driver_rating,     → higher rated preferred
  car_type_match,    → requested type only
  acceptance_rate,   → frequent rejectors ranked lower
  estimated_pickup,  → actual ETA to rider
)
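A sketch of such a scoring function. The weights are made up for illustration (real systems tune them against marketplace metrics), and car type acts as a hard filter rather than a weighted term:

```python
def score_driver(distance_km, rating, car_type_ok, acceptance_rate, eta_min,
                 weights=(0.35, 0.2, 0.25, 0.2)):
    # Illustrative weights — not production values.
    if not car_type_ok:
        return 0.0  # requested car type is a hard filter, not a weighted factor
    w_dist, w_rating, w_accept, w_eta = weights
    return (w_dist   * max(0.0, 1 - distance_km / 3.0)   # closer is better
            + w_rating * rating / 5.0                    # 5-star rating scale
            + w_accept * acceptance_rate                 # 0..1, rejectors ranked lower
            + w_eta    * max(0.0, 1 - eta_min / 15.0))   # shorter pickup ETA preferred
```

A close, well-rated driver with a short ETA should always outrank a distant, frequently-rejecting one — the weights only decide by how much.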

Matching Flow

Rider requests ride at location L

Matching Service
→ GEORADIUS: drivers within 3km → [driver1...driver20]
→ Score each driver
→ Offer to highest scored driver
→ Driver has 15 seconds to accept/reject
→ Accepted → match confirmed ✅
→ Rejected/timeout → offer to next driver
→ Repeat

The Lock Problem

Two riders cannot be matched to the same driver simultaneously:

Rider A and Rider B both near Driver X
Both matching services select Driver X simultaneously
Driver X gets two offers → accepts both → disaster ❌

Fix — distributed lock in Redis:

Before offering ride to Driver X:
SET "lock:driver:driverX" riderId NX EX 20
  NX → only set if key doesn't exist
  EX 20 → auto-expires in 20 seconds

Lock acquired → offer ride
Lock exists → driver taken, skip to next driver

Driver accepts → lock becomes permanent assignment
Driver rejects → delete lock → driver available again
Timeout → lock expires automatically → driver available
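The NX-with-expiry semantics can be sketched with an in-memory stand-in (in production the lock must live in Redis, or some other shared store, so every matching-service instance sees the same locks):

```python
import time

class DriverLock:
    # In-memory stand-in for Redis `SET lock:driver:<id> <riderId> NX EX <ttl>`.
    def __init__(self):
        self._locks = {}  # driver_id -> (rider_id, expires_at)

    def acquire(self, driver_id, rider_id, ttl=20):
        now = time.monotonic()
        held = self._locks.get(driver_id)
        if held and held[1] > now:
            return False                       # NX: key exists and hasn't expired
        self._locks[driver_id] = (rider_id, now + ttl)  # EX: auto-expiry timestamp
        return True

    def release(self, driver_id):
        self._locks.pop(driver_id, None)       # driver rejected → available again
```

One detail the sketch glosses over: in real Redis, release should delete the key only if its value still matches the acquiring riderId (typically done atomically with a small Lua script), so a slow worker can never release a lock that has since been taken by a different rider.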

Step 5: Real-Time Location During Ride

Once matched, rider needs to see driver moving on map.

Polling vs WebSocket: unlike the notification counter (where polling was sufficient), real-time location during an active ride genuinely benefits from WebSocket. A 3-second polling delay on a live map feels broken to the user watching it.

Driver app → sends location every 3s → Kafka
Kafka → Location Service → updates Redis
Location Service → pushes to rider's WebSocket connection
Rider's map → updates smoothly

Step 6: Surge Pricing

The Formula

Surge multiplier = f(demand / supply in this geohash cell)

demand/supply ratio:
< 1.0  → normal pricing
1.0-1.5 → 1.2x
1.5-2.0 → 1.5x
2.0-3.0 → 2.0x
> 3.0   → 3.0x (capped)
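The tier table translates directly into a lookup function. The zero-supply clamp below is an assumption: a cell with demand and no drivers simply gets the cap:

```python
def surge_multiplier(demand, supply):
    # Maps the demand/supply ratio in a geohash cell to the tiers above.
    ratio = demand / max(supply, 1)  # clamp: no drivers → treat supply as 1
    if ratio < 1.0:
        return 1.0   # normal pricing
    if ratio < 1.5:
        return 1.2
    if ratio < 2.0:
        return 1.5
    if ratio < 3.0:
        return 2.0
    return 3.0       # capped
```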

Architecture

Every 60 seconds per geohash cell (level 5):

Surge Calculator Service:
→ Count ride requests in cell, last 5 minutes (from Kafka)
→ Count available drivers in cell (from Redis geospatial)
→ Calculate ratio
→ Store: Redis key "surge:geohash:te7ud" = 1.5, TTL 90s

When rider requests ride:
→ Get rider's geohash
→ Lookup Redis surge multiplier
→ Apply to fare, show upfront

Step 7: Fare and Payment

Fare = (base_fare
      + per_km_rate × distance
      + per_minute_rate × duration)
      × surge_multiplier
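As a worked example, with illustrative rates (these are not actual Uber/Ola tariffs):

```python
def fare(distance_km, duration_min, surge=1.0,
         base=50.0, per_km=12.0, per_min=1.5):
    # Rates in ₹, chosen only for illustration.
    return (base + per_km * distance_km + per_min * duration_min) * surge
```

A 10km, 30-minute ride at 1.5x surge: (50 + 120 + 45) × 1.5 = ₹322.50.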

Payment:
→ Charge rider's saved payment method
→ PostgreSQL ACID transaction — no compromise
→ Deduct platform commission
→ Queue driver payout (Kafka → batch payout service)

Payment is the one component where eventual consistency is never acceptable.


Step 8: Data Model

rides — PostgreSQL:

ride_id, rider_id, driver_id
status → requested/matched/started/completed/cancelled
pickup_location, dropoff_location
requested_at, started_at, completed_at
fare, surge_multiplier, payment_status

Why PostgreSQL: ACID for payments, relational (rider + driver + payment), complex support/analytics queries, manageable volume.

driver_locations — Redis only (not persisted):

Real-time locations only — history not needed here
TTL: 30 seconds — no update = driver considered offline
Evicted automatically when driver goes offline

ride_location_history — Cassandra:

ride_id, timestamp, latitude, longitude

Every 3-second ping during ride stored here
Used for: dispute resolution, route verification
Time series, write-heavy → Cassandra

Step 9: Full Architecture

[Driver App]
Sends location every 3s

[Location Service]
→ GEOADD to Redis
→ Publishes to Kafka "location.updated"
→ Writes to Cassandra (ride history)

[Rider App]
Requests ride

[Ride Request Service]
→ Saves to PostgreSQL
→ Publishes to Kafka "ride.requested"

[Matching Service]
→ GEORADIUS on Redis → nearby drivers
→ Score + rank drivers
→ Redis distributed lock on chosen driver
→ Send offer via WebSocket

[Driver Accepts]
→ Ride status updated in PostgreSQL
→ Rider notified via WebSocket
→ Real-time tracking begins

[During Ride]
Driver location → Kafka → Location Service → Redis
Redis → WebSocket Server → Rider map updates

[Ride Completes]
→ Fare calculated
→ Payment via PostgreSQL ACID
→ Driver payout queued to Kafka
→ Rating prompts sent

[Surge Calculator] (runs every 60s)
→ Reads demand from Kafka
→ Reads supply from Redis
→ Writes surge multipliers to Redis

Feature Extension: Scheduled Rides

The ask: book a ride 2 hours in advance with guaranteed availability.

Problem with current matching: Redis geospatial only works for drivers available right now. Scheduled rides need drivers available at a future time.

New table:

driver_schedule (PostgreSQL):
driver_id, start_time, end_time, ride_id, status

Flow:

Rider books for 6:00 PM
→ Store in PostgreSQL, status: "pending_match"
→ No matching yet

At 5:30 PM, Scheduler Service triggers:
→ Find drivers:
   a) Active in city
   b) No entry in driver_schedule during 5:45–7:00 PM
   c) Within reasonable distance of pickup area
→ Offer ride with same lock mechanism as regular rides

Driver accepts:
→ Block driver from new rides starting 30 min before pickup
→ Regular ride requests blocked for this driver
→ Scheduled ride protected

The guarantee is enforced by blocking the driver’s window — holding a driver idle has real cost, which is why scheduled rides have cancellation fees.
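The availability check in step (b) is an interval-overlap test. A minimal sketch, treating each driver's schedule as a list of (start, end) commitments:

```python
from datetime import datetime

def is_free(schedule, start, end):
    # schedule: list of (start_time, end_time) commitments for one driver.
    # Free iff no existing entry overlaps the half-open window [start, end).
    return all(s_end <= start or s_start >= end for s_start, s_end in schedule)
```

In SQL terms this is a `NOT EXISTS` query against driver_schedule for rows whose window intersects 5:45–7:00 PM; rejected candidates are simply skipped, and accepted ones get a new driver_schedule row that blocks regular matching.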


Feature Extension: Carpooling

The ask: multiple riders going in similar directions share one car.

This breaks the core assumption of regular matching: one rider, one driver, immediate. Carpooling needs multi-party, route-aware, windowed matching.

Route Compatibility Check

Rider A: Andheri → Bandra (heading south)
Rider B: Juhu → Worli (heading south)

Compatible?
→ Both heading south ✅
→ Juhu is near Andheri ✅
→ Worli is near Bandra ✅
→ Detour to add B < 10% of original route ✅
→ Neither rider's journey extended > 20% ✅
→ Compatible ✅

Rider C: Andheri → Powai (heading east)
→ Incompatible — different direction entirely ❌
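A rough first-pass screen for these checks, as a heuristic sketch only: compare the initial headings of the two trips and require pickups and dropoffs to be near each other. The thresholds and coordinates below are illustrative, and a real system would confirm the 10%/20% detour budgets with a routing engine:

```python
import math

def _bearing(p, q):
    # Initial great-circle bearing from p to q, in degrees (0 = north).
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = math.cos(lat1) * math.sin(lat2) - math.sin(lat1) * math.cos(lat2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360

def _dist_km(p, q):
    # Equirectangular approximation — accurate enough at city scale.
    mean_lat = math.radians((p[0] + q[0]) / 2)
    dy = (q[0] - p[0]) * 111.0
    dx = (q[1] - p[1]) * 111.0 * math.cos(mean_lat)
    return math.hypot(dx, dy)

def compatible(a_pick, a_drop, b_pick, b_drop,
               max_heading_diff=60.0, max_endpoint_km=5.0):
    # Heuristic: similar heading, pickups near each other, dropoffs near
    # each other. Thresholds are illustrative, not tuned values.
    diff = abs(_bearing(a_pick, a_drop) - _bearing(b_pick, b_drop))
    diff = min(diff, 360 - diff)
    return (diff <= max_heading_diff
            and _dist_km(a_pick, b_pick) <= max_endpoint_km
            and _dist_km(a_drop, b_drop) <= max_endpoint_km)
```

With approximate coordinates for the neighbourhoods above, Andheri → Bandra and Juhu → Worli pass the screen (both heading roughly south, nearby endpoints), while Andheri → Powai fails on heading alone.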

Carpool Matching Pool

Redis key: "carpool:pool:geohash:te7ud"
Value: list of pending carpool requests in this area
TTL: 3 minutes (matching window)

New carpool request:
→ Add to pool
→ Check existing pool for route-compatible riders
→ Compatible match found → lock in carpool group
→ Window expires → match whoever is in pool

Route Calculation Service (new component)

Updated Schema

rides additions:
carpool_group_id  → links riders in same car
pickup_order      → 1st or 2nd pickup
dropoff_order     → 1st or 2nd dropoff
original_eta      → quoted at booking
actual_eta        → real-time updated

Pricing

Solo ride Andheri → Bandra: ₹200
Carpool (2 riders):
→ Each rider pays ₹120 (saves ₹80)
→ Driver earns ₹240 (more than solo)
→ Platform earns same commission %

CAP Trade-offs Per Feature

Feature         | Model | Reason
Driver location | AP    | Slight staleness fine
Surge pricing   | AP    | 60s staleness acceptable
Ride matching   | CP    | Can't double-assign a driver
Payment         | CP    | Consistency mandatory

Key Takeaways

  1. Geohashing converts geographic proximity into a string prefix problem — fast, scalable.
  2. Redis geospatial handles 333K location writes/second; use it as the real-time driver store, not a traditional database.
  3. Distributed locks in Redis prevent double-matching. NX EX pattern = atomic lock with auto-expiry.
  4. For active ride tracking, WebSocket is justified — visible map lag genuinely breaks UX.
  5. Surge pricing uses geohash cells to compute demand/supply ratio per area in near real-time.
  6. Payments are the one component where CP and ACID are non-negotiable.
  7. Extending a system means identifying which assumption the new feature breaks, then addressing only that.

Part of the system design series. Next: designing a video streaming system — encoding pipelines, adaptive bitrate, and CDN at Hotstar scale.