Table of contents
- Introduction
- Step 1: Requirements
- Step 2: Scale Estimation
- Step 3: The Core Problem — Two Approaches
- Step 4: Data Model
- Step 5: Full Architecture
- Step 6: Three Levels of Caching
- Step 7: Edge Cases
- Feature Extension: Ranked Feed
- Feature Extension: Stories
- Complete Architecture
- Everything Connects
- Key Takeaways
Introduction
A feed system is what powers Twitter, Instagram, and Facebook. It’s also where the hardest scaling problem in social media lives — showing 500 million users a personalised feed in under 100ms. The key challenge isn’t storage or bandwidth. It’s the fundamental tension between write speed and read speed.
Step 1: Requirements
Functional:
- User sees a feed of posts from people they follow
- Sorted roughly by recency
- New posts from followed users appear in feed
- Infinite scroll — load more as user scrolls
- Supports text, images, videos
Non-functional:
- Feed load under 100ms
- High availability
- Eventual consistency — slight feed delay is acceptable
- 500M daily active users
- Celebrity problem — some users have 250M followers
Step 2: Scale Estimation
500M daily active users
Posts created/day: 100M = ~1,200 writes/second
Feed reads:
500M users × 10 feed opens/day = 5B reads/day
= ~58,000 reads/second
Read:Write ratio = ~50:1 — extremely read-heavy
This immediately tells you:
- Optimise aggressively for reads
- Writes can be slower and more complex
- Pre-computation is necessary
- Caching is critical
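The estimates above are simple back-of-envelope arithmetic; a quick sanity check:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

posts_per_day = 100_000_000
feed_opens_per_day = 500_000_000 * 10  # 500M DAU x 10 opens each

writes_per_sec = posts_per_day / SECONDS_PER_DAY
reads_per_sec = feed_opens_per_day / SECONDS_PER_DAY

print(round(writes_per_sec))                   # ~1,157 writes/second
print(round(reads_per_sec))                    # ~57,870 reads/second
print(round(reads_per_sec / writes_per_sec))   # ~50:1 read:write ratio
```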
Step 3: The Core Problem — Two Approaches
Approach 1: Pull Model (Fan-out on Read)
Compute feed on the fly when user opens it.
User opens feed
→ Fetch all 200 followees
→ Query DB for recent posts from each
→ Merge and sort 4,000 posts
→ Return top 20
Problem: at 58,000 feed opens/second this means millions of DB queries per second. Database collapses immediately.
Works when: user follows very few people, low traffic, content changes so fast caching is useless.
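A minimal sketch of fan-out on read, assuming a hypothetical in-memory `posts_by_user` store in place of real DB queries. Each followee's posts are already sorted newest-first, so a k-way merge produces the feed:

```python
import heapq

# Hypothetical stand-in for "recent posts by user":
# user_id -> list of (timestamp, post_id), newest first.
posts_by_user = {
    "u1": [(105, "p5"), (101, "p1")],
    "u2": [(104, "p4"), (102, "p2")],
    "u3": [(103, "p3")],
}

def pull_feed(followee_ids, limit=20):
    """Fan-out on read: merge each followee's recent posts at request time."""
    streams = (posts_by_user.get(uid, []) for uid in followee_ids)
    # Each stream is sorted newest-first, so heapq.merge gives a sorted result.
    merged = heapq.merge(*streams, key=lambda p: p[0], reverse=True)
    return [post_id for _, post_id in list(merged)[:limit]]

print(pull_feed(["u1", "u2", "u3"], limit=3))  # ['p5', 'p4', 'p3']
```

The merge itself is cheap; the killer is the N database queries per feed open that feed `posts_by_user` in a real system.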
Approach 2: Push Model (Fan-out on Write)
Pre-compute and deliver to follower feeds when a post is created.
User u123 posts
→ Fetch all 1,000 followers
→ Insert post into each follower's feed cache
→ Done — feeds already prepared
User opens feed
→ Just read pre-computed feed from cache → milliseconds
Problem — the celebrity problem:
Virat Kohli posts
→ 250M followers
→ 250M write operations
→ Takes hours to complete fan-out
→ Followers see post hours late
Works when: users have moderate follower counts, read speed is critical, write delay is acceptable.
The Real Solution: Hybrid Model
This is what Twitter, Instagram, and Facebook actually use.
Classify users:
- Normal users (< 10K followers) → push model — fan-out on write immediately
- Celebrity users (> 10K followers) → pull model — NOT pushed to feeds
Feed read flow:
User opens feed
→ Read pre-computed feed from Redis (normal users' posts) → fast
→ Fetch recent posts from celebrities they follow → separate query
→ Merge both lists → sort by timestamp
→ Return unified feed under 100ms ✅
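The read-side merge of the hybrid model can be sketched in a few lines, assuming the pre-computed IDs arrive from Redis and the celebrity posts from a separate Cassandra query (both represented here as plain lists of `(timestamp, post_id)` pairs):

```python
def merge_feed(precomputed, celebrity_posts, limit=20):
    """Merge the pre-computed feed with celebrity posts pulled at read time.

    Both inputs are lists of (timestamp, post_id) tuples.
    """
    combined = precomputed + celebrity_posts
    combined.sort(key=lambda p: p[0], reverse=True)  # newest first
    return [post_id for _, post_id in combined[:limit]]

normal = [(100, "p1"), (98, "p2")]   # from the Redis feed cache
celebrity = [(99, "c1")]             # pulled from Cassandra
print(merge_feed(normal, celebrity, limit=3))  # ['p1', 'c1', 'p2']
```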
Step 4: Data Model
Posts — Cassandra:
post_id → time-based unique ID (sortable)
user_id → author
content → text
media_urls → array
created_at → timestamp
like_count → counter (via Redis, synced async)
comment_count → counter
Cassandra fits: 1,200 writes/second, time series, simple access pattern (posts by user_id).
Feed cache — Redis Sorted Set:
Key: "feed:userId"
Value: sorted set of post_ids, scored by timestamp
Max: 1,000 post IDs per user
redis.zadd("feed:u456", timestamp, postId)
redis.zrevrange("feed:u456", 0, 19) → 20 newest post IDs
Redis sorted sets are perfect: O(log n) insert, O(log n) range query, always sorted by score.
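The feed-cache semantics can be mimicked without a Redis server. The class below is a hypothetical in-memory stand-in for ZADD, ZREVRANGE, and the trim step, just to make the behaviour concrete:

```python
import bisect

class FeedCache:
    """In-memory stand-in for a Redis sorted set feed cache."""

    def __init__(self, max_size=1000):
        self.feeds = {}          # key -> list of (score, member), ascending
        self.max_size = max_size

    def zadd(self, key, score, member):
        feed = self.feeds.setdefault(key, [])
        bisect.insort(feed, (score, member))   # keep sorted by score
        if len(feed) > self.max_size:          # trim the oldest entries
            del feed[: len(feed) - self.max_size]

    def zrevrange(self, key, start, stop):
        feed = self.feeds.get(key, [])
        newest_first = feed[::-1]              # highest score first
        return [member for _, member in newest_first[start : stop + 1]]

cache = FeedCache(max_size=1000)
cache.zadd("feed:u456", 100, "p1")
cache.zadd("feed:u456", 105, "p2")
print(cache.zrevrange("feed:u456", 0, 19))  # ['p2', 'p1']
```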
Follow relationships — PostgreSQL:
follower_id → user who follows
followee_id → user being followed
created_at → timestamp
Needs indexes in both directions:
- “all followers of u123” — for fan-out
- “all users u456 follows” — for feed generation
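The design calls for PostgreSQL; SQLite is used below purely to demonstrate the two access paths and why each needs its own index. Table and index names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE follows (
    follower_id TEXT NOT NULL,
    followee_id TEXT NOT NULL,
    created_at  INTEGER NOT NULL,
    PRIMARY KEY (follower_id, followee_id)
);
-- The primary key serves "all users u456 follows" (leading column follower_id).
-- A second index serves "all followers of u123" for fan-out.
CREATE INDEX idx_follows_followee ON follows (followee_id);
""")
conn.executemany(
    "INSERT INTO follows VALUES (?, ?, ?)",
    [("u456", "u123", 1), ("u789", "u123", 2), ("u456", "u999", 3)],
)

followers_of_u123 = [r[0] for r in conn.execute(
    "SELECT follower_id FROM follows WHERE followee_id = 'u123'")]
followees_of_u456 = [r[0] for r in conn.execute(
    "SELECT followee_id FROM follows WHERE follower_id = 'u456'")]
print(sorted(followers_of_u123))  # ['u456', 'u789']
print(sorted(followees_of_u456))  # ['u123', 'u999']
```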
Step 5: Full Architecture
Write Path
User creates post
↓
App Server
↓
1. Save post to Cassandra (persistent storage)
↓
2. Publish to Kafka: "post.created"
{ postId, userId, timestamp }
↓
Fan-out Service consumes event:
→ Fetch followers of userId
→ Celebrity? (> 10K followers)
Yes → skip fan-out, mark as celebrity post
No → for each follower:
zadd "feed:followerId" timestamp postId
trim feed to 1,000 posts
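The fan-out consumer above can be sketched as follows. The follow graph and feed cache are hypothetical in-memory dicts standing in for PostgreSQL and Redis:

```python
CELEBRITY_THRESHOLD = 10_000
MAX_FEED = 1000

# Hypothetical stand-ins for the follow graph and the Redis feed cache.
followers = {"u123": ["u456", "u789"],
             "celeb1": ["u%d" % i for i in range(20_000)]}
feeds = {}            # "feed:<userId>" -> list of (timestamp, post_id)
celebrity_posts = []  # posts skipped at fan-out, pulled at read time

def handle_post_created(event):
    """Consume a 'post.created' event and fan out to follower feeds."""
    fans = followers.get(event["userId"], [])
    if len(fans) > CELEBRITY_THRESHOLD:
        celebrity_posts.append(event["postId"])  # pull model on read
        return
    for follower_id in fans:
        feed = feeds.setdefault(f"feed:{follower_id}", [])
        feed.append((event["timestamp"], event["postId"]))
        feed.sort(reverse=True)   # newest first
        del feed[MAX_FEED:]       # trim to 1,000 entries

handle_post_created({"postId": "p1", "userId": "u123", "timestamp": 100})
handle_post_created({"postId": "p2", "userId": "celeb1", "timestamp": 101})
print(feeds["feed:u456"])  # [(100, 'p1')]
print(celebrity_posts)     # ['p2']
```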
Read Path
User requests feed
↓
1. Fetch top 200 post IDs from Redis "feed:userId"
↓
2. Fetch celebrity posts separately
(get celebrities user follows → recent posts from Cassandra)
↓
3. Merge + sort by timestamp → take top 20
↓
4. Batch fetch post details
→ Redis post cache (hit)
→ Cassandra (miss only) → store in Redis
↓
5. Return feed
Step 6: Three Levels of Caching
Level 1 — Feed cache (Redis sorted set):
"feed:userId" → sorted post IDs
No expiry — updated on every new post
Trimmed to 1,000 posts max
Level 2 — Post cache (Redis hash):
"post:postId" → full post data
Cache Aside strategy
TTL: 24 hours
Level 3 — User profile cache (Redis):
"user:userId" → username, avatar, verified
TTL: 1 hour
Without Level 3: fetching 20 posts = 20 separate profile DB lookups. With cache: all from Redis in one pass.
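The Cache Aside pattern used at Levels 2 and 3 looks like this in miniature. Two dicts stand in for Redis and Cassandra; the counter shows that repeated reads hit the DB only once per TTL window:

```python
import time

# Hypothetical stores: a dict as the Redis post cache, a dict as Cassandra.
db = {"p1": {"content": "hello", "user_id": "u123"}}
cache = {}             # post_id -> (expires_at, post)
TTL_SECONDS = 24 * 3600
db_reads = 0

def get_post(post_id):
    """Cache Aside: try the cache first, fall back to the DB on a miss."""
    global db_reads
    entry = cache.get(post_id)
    if entry and entry[0] > time.time():
        return entry[1]                      # cache hit
    db_reads += 1
    post = db.get(post_id)                   # cache miss: read the DB
    if post is not None:
        cache[post_id] = (time.time() + TTL_SECONDS, post)
    return post

get_post("p1")   # miss: reads the DB, populates the cache
get_post("p1")   # hit: served from the cache
print(db_reads)  # 1
```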
Step 7: Edge Cases
New user — empty feed:
No follows yet → show trending posts, suggested users, popular content by interest category. Computed separately, not from follow graph.
User returns after long absence:
Feed cache expired → fall back to pull model just once, reconstruct feed, populate Redis. Future reads served from cache.
Post deletion:
Can’t immediately purge from millions of Redis caches. Use soft delete — mark deleted in Cassandra, filter on read, Redis cache expires naturally. Eventual consistency is acceptable.
Like/comment counts at scale:
1M likes in an hour = 1M DB writes. Instead:
→ Increment count in Redis instantly (write-behind)
→ Async job syncs Redis count to Cassandra every 60 seconds
→ Reads from Redis always — fast
→ DB eventually consistent — acceptable
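A minimal sketch of the write-behind counter, with dicts standing in for Redis and Cassandra. A thousand likes become a thousand Redis increments but only one DB write per sync:

```python
from collections import defaultdict

# Hypothetical stores: Redis-side counters and the DB-side totals.
redis_counts = defaultdict(int)  # absorbs every like instantly
db_counts = defaultdict(int)     # synced in batches by an async job

def like(post_id):
    """Write-behind: increment in Redis only; the DB is updated later."""
    redis_counts[post_id] += 1

def sync_to_db():
    """Runs every ~60s: flush accumulated counts to the database."""
    for post_id, count in redis_counts.items():
        db_counts[post_id] = count  # one DB write per post, not per like

for _ in range(1000):
    like("p1")               # 1,000 likes -> 1,000 Redis increments
sync_to_db()                 # ...but only 1 DB write
print(redis_counts["p1"], db_counts["p1"])  # 1000 1000
```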
Feature Extension: Ranked Feed
The ask: show posts the user is most likely to engage with first (Instagram-style algorithm).
Wrong approach: rank during fan-out.
Fan-out happens when post is brand new — zero likes, zero comments, no engagement signal yet.
Ranking at write time is inaccurate.
Right approach: rank at read time.
User opens feed
→ Fetch 200 post IDs from Redis (chronological)
→ Pass to Ranking Service
→ Ranking Service scores each post:
Score = f(
relationship_score, → how often you interact with this author
post_engagement, → current likes/comments
recency, → how recent
content_affinity, → your past engagement patterns
)
→ Return top 20 ranked posts
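A toy version of the scoring step, with linear weights purely for illustration; production ranking services use learned models, and every signal name and weight below is an assumption:

```python
def score_post(post, weights=None):
    """Toy linear ranking function over normalised [0, 1] signals."""
    w = weights or {
        "relationship_score": 0.4,  # how often you interact with this author
        "post_engagement": 0.3,     # current likes/comments
        "recency": 0.2,             # how recent the post is
        "content_affinity": 0.1,    # your past engagement patterns
    }
    return sum(w[signal] * post[signal] for signal in w)

posts = [
    {"id": "p1", "relationship_score": 0.9, "post_engagement": 0.2,
     "recency": 0.8, "content_affinity": 0.5},
    {"id": "p2", "relationship_score": 0.1, "post_engagement": 0.9,
     "recency": 0.9, "content_affinity": 0.2},
]
ranked = sorted(posts, key=score_post, reverse=True)
print([p["id"] for p in ranked])  # ['p1', 'p2'] — relationship outweighs raw engagement
```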
New components:
- Ranking Service — stateless, horizontally scalable, runs on feed fetch
- User interest model (Redis) — which authors/content types you engage with, updated async
- Post engagement cache (Redis) — real-time like/comment counts per post
The trade-off: ranking adds latency to feed reads. Before ranking: under 10ms. After ranking: up to 100ms. You trade speed for relevance.
Feature Extension: Stories
The ask: posts that disappear after 24 hours, shown at top of feed separately.
No new database needed. Stories are posts with two differences: they expire and they display differently. Extend the existing posts table:
type → "POST" | "STORY" (new)
expires_at → null for posts, created_at + 24h for stories (new)
Expiry — use Cassandra’s native TTL:
INSERT INTO posts (...) USING TTL 86400
Cassandra automatically deletes after 86400 seconds. No worker, no extra logic.
Story feed — separate read query:
SELECT * FROM posts
WHERE type = 'STORY'
  AND user_id IN (users I follow)
  AND expires_at > now()
ORDER BY created_at DESC
Story cache:
Redis key: "stories:userId"
TTL: 5 minutes
5-minute staleness is acceptable for stories
Story views:
story_views (Cassandra):
story_id → which story
viewer_id → who viewed
viewed_at → timestamp
TTL: 25 hours (auto-cleans with story)
Complete Architecture
[User Creates Post]
↓
[Load Balancer] → [App Servers]
↓ ↓
[Cassandra] [Kafka "post.created"]
(post stored) ↓
[Fan-out Service]
Normal → push to Redis feeds
Celebrity → skip, pull on read
↓
[Redis Cluster]
feed:userId → sorted post IDs
post:postId → full post data
user:userId → profile cache
[User Reads Feed]
↓
[Load Balancer] → [App Servers]
↓
[Redis] → pre-computed feed IDs
↓
[Ranking Service] → scores posts
↓
[Redis] → post details (cache hit)
↓ (miss only)
[Cassandra] → post details
↓
[Celebrity posts fetched separately, merged]
↓
[Return unified ranked feed under 100ms]
[Images/Videos]
→ Stored in S3
→ Served via CDN globally
→ Never hits app servers
Everything Connects
| Concept | How It’s Used |
|---|---|
| Latency vs Throughput (L1) | Feed reads = latency sensitive; fan-out = throughput sensitive |
| Scalability (L2) | Fan-out workers scale horizontally |
| CAP Theorem (L3) | Feed = AP (eventual); post storage = CP |
| Consistency (L4) | Read-your-own-writes for your own posts |
| Load Balancers (L5) | Distribute across app servers and fan-out workers |
| Caching (L6) | Three-level cache; write-behind for like counts |
| Databases (L7) | Cassandra (posts, time series), PostgreSQL (follows), Redis (feeds) |
| Message Queues (L8) | Kafka decouples post creation from fan-out |
| CDN (L9) | Images and videos served from edge |
Key Takeaways
- Pull model collapses at scale. Push model collapses for celebrities. Hybrid model solves both.
- Redis sorted sets are purpose-built for feed storage — timestamp as score, always sorted.
- The celebrity threshold (10K followers) is a product + engineering decision, not a fixed number.
- Ranking must happen at read time, not write time — engagement signals don’t exist yet at fan-out.
- Stories need no new database — extend the post model with type and TTL.
- Cassandra native TTL is cleaner than a worker for time-based expiry.
Part of the system design series. Next: designing a ride-sharing system — location-based matching, real-time tracking, and dynamic pricing.