System Design Traffic and Cache: Load Balancers and Redis

Published at 08:15 AM
(4 min read)

Introduction

Once traffic grows, two problems appear quickly:

  1. How do you distribute requests across many servers?
  2. How do you stop the database from being hit for repeated reads?

Load balancers solve the first. Caching (often with Redis) solves the second.

Lesson 5: Load Balancers

What Problem They Solve

Without a load balancer, traffic can overload one server while others sit idle.

With a load balancer, requests are distributed across multiple servers so capacity scales horizontally.

Core Routing Algorithms

  1. Round Robin

    • Send requests in sequence: S1 -> S2 -> S3 -> S1…
    • Good when servers are identical and requests are similar duration.
  2. Weighted Round Robin

    • Higher-capacity servers get more traffic.
    • Good when server sizes differ.
  3. Least Connections

    • Send next request to server with fewest active connections.
    • Best for variable-duration requests (for example, order processing).
  4. IP Hash

    • Same client IP tends to hit the same backend.
    • Useful for sticky sessions or localized cache behavior (sketched in code below).
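
A minimal sketch of these four policies in Python, assuming an in-memory view of backend state; the server names, weights, and connection counts are illustrative:

import hashlib
import itertools

servers = ["s1", "s2", "s3"]

# Round Robin: cycle through servers in fixed order.
rr = itertools.cycle(servers)

def pick_round_robin():
    return next(rr)

# Weighted Round Robin: higher-weight servers appear more often in the cycle.
weights = {"s1": 3, "s2": 1, "s3": 1}
wrr = itertools.cycle([s for s, w in weights.items() for _ in range(w)])

def pick_weighted_round_robin():
    return next(wrr)

# Least Connections: pick the server with the fewest active connections.
active = {s: 0 for s in servers}

def pick_least_connections():
    return min(active, key=active.get)

# IP Hash: the same client IP consistently maps to the same server.
def pick_ip_hash(client_ip):
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

A real balancer would also update connection counts as requests start and finish; this shows only the selection logic.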

Placement in Architecture

Load balancing usually happens at multiple layers: at the edge, where client traffic first enters the system, and internally, between services, so each service pool is balanced on its own.

Avoiding Single Point of Failure

A load balancer can itself fail, so it also needs redundancy: typically an active-passive or active-active pair with health checks, failing over via a floating IP or DNS.

Layer 4 vs Layer 7

Layer 4 balancers route on connection-level information (IP addresses and ports): fast, but blind to request content. Layer 7 balancers inspect the HTTP request (path, headers, cookies), which enables routing different paths to different services at a small processing cost.

Most modern microservice architectures use Layer 7 at the edge.

Exercise: Food Delivery Services

Scenario: User Service, Restaurant Service, and Order Service. Order Service is heavier and more variable.

Questions:

  1. Layer 4 or Layer 7, and why?
  2. Best algorithm for Order Service?
  3. Where should load balancers be placed?

Reference answer:

  1. Layer 7, so the gateway can route by path to the right service.
  2. Least Connections for Order Service, because its requests vary in duration.
  3. At the edge in front of the API gateway, plus internal balancing in front of each service pool.

Lesson 6: Caching with Redis

What Problem Caching Solves

Many reads ask for the same data repeatedly. Without caching, every request hits the database.

Caching reduces:

  • Database load: repeated reads are served from memory instead.
  • Response latency: an in-memory hit is far faster than a disk-backed query.

Cache Placement Options

  • Browser/client cache: closest to the user.
  • CDN/edge cache: static assets and cacheable responses near the user.
  • Application cache (Redis/Memcached): hot data shared across app servers.
  • Database cache: query and buffer caches inside the database itself.

Each serves a different layer of latency/load reduction.

Three Core Caching Strategies

  1. Cache Aside (lazy)

    • App reads cache first; on miss reads DB and populates cache.
    • Most common default for read-heavy paths (sketched below).
  2. Write Through

    • Write DB and cache together.
    • Better freshness, slightly slower writes.
  3. Write Behind

    • Write cache first, persist to DB asynchronously.
    • Very fast writes, but needs strong durability controls.
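
A minimal cache-aside sketch with redis-py; fetch_product_from_db, the key format, and the 300-second TTL are illustrative stand-ins, not fixed choices:

import json

import redis

r = redis.Redis()

def fetch_product_from_db(product_id):
    # Stand-in for a real database read.
    return {"id": product_id, "name": "demo product"}

def get_product(product_id, ttl_seconds=300):
    key = f"product:{product_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: no DB touched
    product = fetch_product_from_db(product_id)  # miss: read the DB
    r.set(key, json.dumps(product), ex=ttl_seconds)  # populate for later readers
    return product

Write through would update the DB and call r.set in the same request path; write behind would r.set first and hand persistence to a queue.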

TTL Is a Design Decision

TTL (time-to-live) controls the trade-off between freshness and load relief: a short TTL keeps data fresh but sends more misses to the database, while a long TTL shields the database but serves staler data.

Pick TTL by the business cost of stale data: for example, seconds for stock levels, minutes for prices, hours for product descriptions.
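
For illustration, that might translate into per-field TTLs like the ones below; the numbers are assumptions about business tolerance, not rules:

import redis

r = redis.Redis()

# Seconds of acceptable staleness, chosen per field by business cost.
TTL_SECONDS = {"stock": 5, "price": 60, "product_details": 3600}

r.set("price:sku42", "999", ex=TTL_SECONDS["price"])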

What Not to Cache

Avoid caching data where staleness is expensive or correctness is critical: authorization and payment state, and stock counts at the moment of checkout. Write-heavy, rarely-read data also gains little from a cache.

Common Cache Failure Patterns

  1. Cache stampede: many requests miss at once -> DB spike
    Mitigate with locking/single-flight refresh.

  2. Cache penetration: repeated requests for non-existent keys
    Mitigate by caching null/negative results briefly.

  3. Cache avalanche: many keys expire simultaneously
    Mitigate with TTL jitter/randomized expiry.
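
A sketch of all three mitigations in one read path, again with redis-py; the lock key format, sentinel value, wait time, and jitter range are illustrative choices:

import json
import random
import time

import redis

r = redis.Redis()
NEGATIVE = b"__none__"  # sentinel meaning "this key does not exist in the DB"

def load_from_db(key):
    # Stand-in for a real database read; returns None when the row is missing.
    return None

def get_with_mitigations(key, base_ttl=300):
    cached = r.get(key)
    if cached is not None:
        return None if cached == NEGATIVE else json.loads(cached)

    # Stampede: single-flight refresh; only the lock winner hits the DB.
    if not r.set(f"lock:{key}", "1", nx=True, ex=10):
        time.sleep(0.05)  # losers wait briefly, then re-read the cache
        return get_with_mitigations(key, base_ttl)

    row = load_from_db(key)
    if row is None:
        # Penetration: cache the miss briefly so repeat lookups skip the DB.
        r.set(key, NEGATIVE, ex=30)
        return None

    # Avalanche: add jitter so keys written together don't expire together.
    r.set(key, json.dumps(row), ex=base_ttl + random.randint(0, base_ttl // 10))
    return row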

Exercise: E-commerce Product Page

Data fields: product details, current price, stock availability, customer reviews.

Questions:

  1. Should each be cached?
  2. Which strategy should be used?
  3. What TTL makes sense?

Reference answer (practical baseline):

  • Product details: cache aside, TTL of hours (they rarely change).
  • Current price: short TTL (seconds to minutes), or write through on price updates.
  • Stock availability: very short TTL, and re-check the DB at checkout.
  • Customer reviews: cache aside, TTL of minutes (slight staleness is acceptable).

Flash Sale Nuance

If price updates are frequent and read traffic is extreme, bypassing the cache and hitting the DB directly will flood it.

The polling problem:

Normal polling:
User → "What's the price?" → Server → every second → 10M requests/sec
                                                      DB dies

The fix — push instead of poll:

WebSocket approach:
Server → "Price changed to ₹999" → all connected users simultaneously
→ 1 write event, 10 million users updated
→ Zero requests from users

Complete flash sale architecture:

  1. Price changes every few seconds.
  2. Each change is written to Redis (the primary store for price) and to a message queue (for async DB sync).
  3. 10M users stay connected via WebSocket.
  4. Each price change event is pushed to all connected users simultaneously.

No polling. No DB flood. Redis handles reads; the DB is updated from the queue in the background.
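
A minimal sketch of the push path, assuming redis-py 4.2+ (for redis.asyncio) and the websockets package (10.1+ for single-argument handlers); the channel name and port are illustrative:

import asyncio

import redis.asyncio as redis
import websockets

CONNECTED = set()  # every open WebSocket connection

async def handler(ws):
    # Clients only listen; they never poll for the price.
    CONNECTED.add(ws)
    try:
        await ws.wait_closed()
    finally:
        CONNECTED.discard(ws)

async def relay_price_events():
    pubsub = redis.Redis().pubsub()
    await pubsub.subscribe("price_updates")
    async for msg in pubsub.listen():
        if msg["type"] == "message":
            # One event fans out to every connected user at once.
            websockets.broadcast(CONNECTED, msg["data"].decode())

async def main():
    async with websockets.serve(handler, "0.0.0.0", 8765):
        await relay_price_events()

asyncio.run(main())

On the write side, each price update is one r.publish("price_updates", ...) plus an enqueue for async DB persistence; neither touches the read path.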

The mindset shift: for high-churn data with massive reads, Redis is not the cache — Redis is the source of truth. The DB becomes the async backup.

Key Takeaways

  1. Load balancers enable horizontal scaling, but they also need redundancy.
  2. Layer 7 routing is usually the right fit for microservice request routing.
  3. Least Connections is strong for uneven request durations.
  4. Caching is a business trade-off between freshness and performance.
  5. TTL, invalidation, and failure-mode handling matter more than just adding Redis.

Part of the system design series. Next: SQL vs NoSQL, message queues, and CDN decision patterns.