Table of contents
- Introduction
- Step 1: Requirements
- Step 2: Scale Estimation
- Step 3: The Core Problem — Two Approaches
- Step 4: Data Model
- Step 5: Full Architecture
- Step 6: Three Levels of Caching
- Step 7: Edge Cases
- Feature Extension: Ranked Feed
- Feature Extension: Stories
- Complete Architecture
- Everything Connects
- Key Takeaways
Introduction
A feed system is what powers Twitter, Instagram, and Facebook. It’s also where the hardest scaling problem in social media lives — showing 500 million users a personalised feed in under 100ms. The key challenge isn’t storage or bandwidth. It’s the fundamental tension between write speed and read speed.
Step 1: Requirements
Functional:
- User sees a feed of posts from people they follow
- Sorted roughly by recency
- New posts from followed users appear in feed
- Infinite scroll — load more as user scrolls
- Supports text, images, videos
Non-functional:
- Feed load under 100ms
- High availability
- Eventual consistency — slight feed delay is acceptable
- 500M daily active users
- Celebrity problem — some users have 250M followers
Step 2: Scale Estimation
500M daily active users
Posts created/day: 100M = ~1,200 writes/second
Feed reads:
500M users × 10 feed opens/day = 5B reads/day
= ~58,000 reads/second
Read:Write ratio = ~50:1 — extremely read-heavy
This immediately tells you:
- Optimise aggressively for reads
- Writes can be slower and more complex
- Pre-computation is necessary
- Caching is critical
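The estimates above are simple back-of-envelope arithmetic; a quick sanity check:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

posts_per_day = 100_000_000
feed_opens_per_day = 500_000_000 * 10  # 500M DAU x 10 opens each

writes_per_sec = posts_per_day / SECONDS_PER_DAY
reads_per_sec = feed_opens_per_day / SECONDS_PER_DAY

print(round(writes_per_sec))                   # ~1,157 writes/second
print(round(reads_per_sec))                    # ~57,870 reads/second
print(round(reads_per_sec / writes_per_sec))   # ~50:1 read:write ratio
```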
Step 3: The Core Problem — Two Approaches
Approach 1: Pull Model (Fan-out on Read)
Compute feed on the fly when user opens it.
User opens feed
→ Fetch all 200 followees
→ Query DB for recent posts from each
→ Merge and sort 4,000 posts
→ Return top 20
Problem: at 58,000 feed opens/second this means millions of DB queries per second. Database collapses immediately.
Works when: user follows very few people, low traffic, content changes so fast caching is useless.
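A minimal sketch of fan-out on read, assuming a hypothetical in-memory `posts_by_user` store in place of real DB queries. Each followee's posts are already sorted newest-first, so a k-way merge produces the feed:

```python
import heapq

# Hypothetical stand-in for "recent posts by user":
# user_id -> list of (timestamp, post_id), newest first.
posts_by_user = {
    "u1": [(105, "p5"), (101, "p1")],
    "u2": [(104, "p4"), (102, "p2")],
    "u3": [(103, "p3")],
}

def pull_feed(followee_ids, limit=20):
    """Fan-out on read: merge each followee's recent posts at request time."""
    streams = (posts_by_user.get(uid, []) for uid in followee_ids)
    # Each stream is sorted newest-first, so heapq.merge gives a sorted result.
    merged = heapq.merge(*streams, key=lambda p: p[0], reverse=True)
    return [post_id for _, post_id in list(merged)[:limit]]

print(pull_feed(["u1", "u2", "u3"], limit=3))  # ['p5', 'p4', 'p3']
```

The merge itself is cheap; the killer is the N database queries per feed open that feed `posts_by_user` in a real system.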
Approach 2: Push Model (Fan-out on Write)
Pre-compute and deliver to follower feeds when a post is created.
User u123 posts
→ Fetch all 1,000 followers
→ Insert post into each follower's feed cache
→ Done — feeds already prepared
User opens feed
→ Just read pre-computed feed from cache → milliseconds
Problem — the celebrity problem:
Virat Kohli posts
→ 250M followers
→ 250M write operations
→ Takes hours to complete fan-out
→ Followers see post hours late
Works when: users have moderate follower counts, read speed is critical, write delay is acceptable.
The Real Solution: Hybrid Model
This is what Twitter, Instagram, and Facebook actually use.
Classify users:
- Normal users (< 10K followers) → push model — fan-out on write immediately
- Celebrity users (> 10K followers) → pull model — NOT pushed to feeds
Feed read flow:
User opens feed
→ Read pre-computed feed from Redis (normal users' posts) → fast
→ Fetch recent posts from celebrities they follow → separate query
→ Merge both lists → sort by timestamp
→ Return unified feed under 100ms ✅
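The read-side merge of the hybrid model can be sketched in a few lines, assuming the pre-computed IDs arrive from Redis and the celebrity posts from a separate Cassandra query (both represented here as plain lists of `(timestamp, post_id)` pairs):

```python
def merge_feed(precomputed, celebrity_posts, limit=20):
    """Merge the pre-computed feed with celebrity posts pulled at read time.

    Both inputs are lists of (timestamp, post_id) tuples.
    """
    combined = precomputed + celebrity_posts
    combined.sort(key=lambda p: p[0], reverse=True)  # newest first
    return [post_id for _, post_id in combined[:limit]]

normal = [(100, "p1"), (98, "p2")]   # from the Redis feed cache
celebrity = [(99, "c1")]             # pulled from Cassandra
print(merge_feed(normal, celebrity, limit=3))  # ['p1', 'c1', 'p2']
```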
Step 4: Data Model
Posts — Cassandra:
post_id → time-based unique ID (sortable)
user_id → author
content → text
media_urls → array
created_at → timestamp
like_count → counter (via Redis, synced async)
comment_count → counter
Cassandra fits: 1,200 writes/second, time series, simple access pattern (posts by user_id).
Feed cache — Redis Sorted Set:
Key: "feed:userId"
Value: sorted set of post_ids, scored by timestamp
Max: 1,000 post IDs per user
redis.zadd("feed:u456", timestamp, postId)
redis.zrevrange("feed:u456", 0, 19) → 20 newest post IDs
Redis sorted sets are perfect: O(log n) insert, O(log n) range query, always sorted by score.
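The feed-cache semantics can be mimicked without a Redis server. The class below is a hypothetical in-memory stand-in for ZADD, ZREVRANGE, and the trim step, just to make the behaviour concrete:

```python
import bisect

class FeedCache:
    """In-memory stand-in for a Redis sorted set feed cache."""

    def __init__(self, max_size=1000):
        self.feeds = {}          # key -> list of (score, member), ascending
        self.max_size = max_size

    def zadd(self, key, score, member):
        feed = self.feeds.setdefault(key, [])
        bisect.insort(feed, (score, member))   # keep sorted by score
        if len(feed) > self.max_size:          # trim the oldest entries
            del feed[: len(feed) - self.max_size]

    def zrevrange(self, key, start, stop):
        feed = self.feeds.get(key, [])
        newest_first = feed[::-1]              # highest score first
        return [member for _, member in newest_first[start : stop + 1]]

cache = FeedCache(max_size=1000)
cache.zadd("feed:u456", 100, "p1")
cache.zadd("feed:u456", 105, "p2")
print(cache.zrevrange("feed:u456", 0, 19))  # ['p2', 'p1']
```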
Follow relationships — PostgreSQL:
follower_id → user who follows
followee_id → user being followed
created_at → timestamp
Needs indexes in both directions:
- “all followers of u123” — for fan-out
- “all users u456 follows” — for feed generation
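The design calls for PostgreSQL; SQLite is used below purely to demonstrate the two access paths and why each needs its own index. Table and index names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE follows (
    follower_id TEXT NOT NULL,
    followee_id TEXT NOT NULL,
    created_at  INTEGER NOT NULL,
    PRIMARY KEY (follower_id, followee_id)
);
-- The primary key serves "all users u456 follows" (leading column follower_id).
-- A second index serves "all followers of u123" for fan-out.
CREATE INDEX idx_follows_followee ON follows (followee_id);
""")
conn.executemany(
    "INSERT INTO follows VALUES (?, ?, ?)",
    [("u456", "u123", 1), ("u789", "u123", 2), ("u456", "u999", 3)],
)

followers_of_u123 = [r[0] for r in conn.execute(
    "SELECT follower_id FROM follows WHERE followee_id = 'u123'")]
followees_of_u456 = [r[0] for r in conn.execute(
    "SELECT followee_id FROM follows WHERE follower_id = 'u456'")]
print(sorted(followers_of_u123))  # ['u456', 'u789']
print(sorted(followees_of_u456))  # ['u123', 'u999']
```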
Step 5: Full Architecture
Write Path
User creates post
↓
App Server
↓
1. Save post to Cassandra (persistent storage)
↓
2. Publish to Kafka: "post.created"
{ postId, userId, timestamp }
↓
Fan-out Service consumes event:
→ Fetch followers of userId
→ Celebrity? (> 10K followers)
Yes → skip fan-out, mark as celebrity post
No → for each follower:
zadd "feed:followerId" timestamp postId
trim feed to 1,000 posts
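The fan-out consumer above can be sketched as follows. The follow graph and feed cache are hypothetical in-memory dicts standing in for PostgreSQL and Redis:

```python
CELEBRITY_THRESHOLD = 10_000
MAX_FEED = 1000

# Hypothetical stand-ins for the follow graph and the Redis feed cache.
followers = {"u123": ["u456", "u789"],
             "celeb1": ["u%d" % i for i in range(20_000)]}
feeds = {}            # "feed:<userId>" -> list of (timestamp, post_id)
celebrity_posts = []  # posts skipped at fan-out, pulled at read time

def handle_post_created(event):
    """Consume a 'post.created' event and fan out to follower feeds."""
    fans = followers.get(event["userId"], [])
    if len(fans) > CELEBRITY_THRESHOLD:
        celebrity_posts.append(event["postId"])  # pull model on read
        return
    for follower_id in fans:
        feed = feeds.setdefault(f"feed:{follower_id}", [])
        feed.append((event["timestamp"], event["postId"]))
        feed.sort(reverse=True)   # newest first
        del feed[MAX_FEED:]       # trim to 1,000 entries

handle_post_created({"postId": "p1", "userId": "u123", "timestamp": 100})
handle_post_created({"postId": "p2", "userId": "celeb1", "timestamp": 101})
print(feeds["feed:u456"])  # [(100, 'p1')]
print(celebrity_posts)     # ['p2']
```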
Read Path
User requests feed
↓
1. Fetch top 200 post IDs from Redis "feed:userId"
↓
2. Fetch celebrity posts separately
(get celebrities user follows → recent posts from Cassandra)
↓
3. Merge + sort by timestamp → take top 20
↓
4. Batch fetch post details
→ Redis post cache (hit)
→ Cassandra (miss only) → store in Redis
↓
5. Return feed
Step 6: Three Levels of Caching
Level 1 — Feed cache (Redis sorted set):
"feed:userId" → sorted post IDs
No expiry — updated on every new post
Trimmed to 1,000 posts max
Level 2 — Post cache (Redis hash):
"post:postId" → full post data
Cache Aside strategy
TTL: 24 hours
Level 3 — User profile cache (Redis):
"user:userId" → username, avatar, verified
TTL: 1 hour
Without Level 3: fetching 20 posts = 20 separate profile DB lookups. With cache: all from Redis in one pass.
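The Cache Aside pattern used at Levels 2 and 3 looks like this in miniature. Two dicts stand in for Redis and Cassandra; the counter shows that repeated reads hit the DB only once per TTL window:

```python
import time

# Hypothetical stores: a dict as the Redis post cache, a dict as Cassandra.
db = {"p1": {"content": "hello", "user_id": "u123"}}
cache = {}             # post_id -> (expires_at, post)
TTL_SECONDS = 24 * 3600
db_reads = 0

def get_post(post_id):
    """Cache Aside: try the cache first, fall back to the DB on a miss."""
    global db_reads
    entry = cache.get(post_id)
    if entry and entry[0] > time.time():
        return entry[1]                      # cache hit
    db_reads += 1
    post = db.get(post_id)                   # cache miss: read the DB
    if post is not None:
        cache[post_id] = (time.time() + TTL_SECONDS, post)
    return post

get_post("p1")   # miss: reads the DB, populates the cache
get_post("p1")   # hit: served from the cache
print(db_reads)  # 1
```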
Step 7: Edge Cases
New user — empty feed:
No follows yet → show trending posts, suggested users, popular content by interest category. Computed separately, not from follow graph.
User returns after long absence:
Feed cache expired → fall back to pull model just once, reconstruct feed, populate Redis. Future reads served from cache.
Post deletion:
Can’t immediately purge from millions of Redis caches. Use soft delete — mark deleted in Cassandra, filter on read, Redis cache expires naturally. Eventual consistency is acceptable.
Like/comment counts at scale:
1M likes in an hour = 1M DB writes. Instead:
→ Increment count in Redis instantly (write-behind)
→ Async job syncs Redis count to Cassandra every 60 seconds
→ Reads from Redis always — fast
→ DB eventually consistent — acceptable
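A minimal sketch of the write-behind counter, with dicts standing in for Redis and Cassandra. A thousand likes become a thousand Redis increments but only one DB write per sync:

```python
from collections import defaultdict

# Hypothetical stores: Redis-side counters and the DB-side totals.
redis_counts = defaultdict(int)  # absorbs every like instantly
db_counts = defaultdict(int)     # synced in batches by an async job

def like(post_id):
    """Write-behind: increment in Redis only; the DB is updated later."""
    redis_counts[post_id] += 1

def sync_to_db():
    """Runs every ~60s: flush accumulated counts to the database."""
    for post_id, count in redis_counts.items():
        db_counts[post_id] = count  # one DB write per post, not per like

for _ in range(1000):
    like("p1")               # 1,000 likes -> 1,000 Redis increments
sync_to_db()                 # ...but only 1 DB write
print(redis_counts["p1"], db_counts["p1"])  # 1000 1000
```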
Feature Extension: Ranked Feed
The ask: show posts the user is most likely to engage with first (Instagram-style algorithm).
Wrong approach: rank during fan-out.
Fan-out happens when post is brand new — zero likes, zero comments, no engagement signal yet.
Ranking at write time is inaccurate.
Right approach: rank at read time.
User opens feed
→ Fetch 200 post IDs from Redis (chronological)
→ Pass to Ranking Service
→ Ranking Service scores each post:
Score = f(
relationship_score, → how often you interact with this author
post_engagement, → current likes/comments
recency, → how recent
content_affinity, → your past engagement patterns
)
→ Return top 20 ranked posts
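A toy version of the scoring step, with linear weights purely for illustration; production ranking services use learned models, and every signal name and weight below is an assumption:

```python
def score_post(post, weights=None):
    """Toy linear ranking function over normalised [0, 1] signals."""
    w = weights or {
        "relationship_score": 0.4,  # how often you interact with this author
        "post_engagement": 0.3,     # current likes/comments
        "recency": 0.2,             # how recent the post is
        "content_affinity": 0.1,    # your past engagement patterns
    }
    return sum(w[signal] * post[signal] for signal in w)

posts = [
    {"id": "p1", "relationship_score": 0.9, "post_engagement": 0.2,
     "recency": 0.8, "content_affinity": 0.5},
    {"id": "p2", "relationship_score": 0.1, "post_engagement": 0.9,
     "recency": 0.9, "content_affinity": 0.2},
]
ranked = sorted(posts, key=score_post, reverse=True)
print([p["id"] for p in ranked])  # ['p1', 'p2'] — relationship outweighs raw engagement
```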
New components:
- Ranking Service — stateless, horizontally scalable, runs on feed fetch
- User interest model (Redis) — which authors/content types you engage with, updated async
- Post engagement cache (Redis) — real-time like/comment counts per post
The trade-off: ranking adds latency to feed reads. Before ranking: under 10ms. After ranking: up to 100ms. You trade speed for relevance.
Feature Extension: Stories
The ask: posts that disappear after 24 hours, shown at top of feed separately.
No new database needed. Stories are posts with two differences: they expire and they display differently. Extend the existing posts table:
type → "POST" | "STORY" (new)
expires_at → null for posts, created_at + 24h for stories (new)
Expiry — use Cassandra’s native TTL:
INSERT INTO posts (...) USING TTL 86400
Cassandra automatically deletes after 86400 seconds. No worker, no extra logic.
Story feed — separate read query:
SELECT * FROM posts
WHERE type = 'STORY'
  AND user_id IN (users I follow)
  AND expires_at > now()
ORDER BY created_at DESC
Story cache:
Redis key: "stories:userId"
TTL: 5 minutes
5-minute staleness is acceptable for stories
Story views:
story_views (Cassandra):
story_id → which story
viewer_id → who viewed
viewed_at → timestamp
TTL: 25 hours (auto-cleans with story)
Complete Architecture
[User Creates Post]
↓
[Load Balancer] → [App Servers]
↓ ↓
[Cassandra] [Kafka "post.created"]
(post stored) ↓
[Fan-out Service]
Normal → push to Redis feeds
Celebrity → skip, pull on read
↓
[Redis Cluster]
feed:userId → sorted post IDs
post:postId → full post data
user:userId → profile cache
[User Reads Feed]
↓
[Load Balancer] → [App Servers]
↓
[Redis] → pre-computed feed IDs
↓
[Ranking Service] → scores posts
↓
[Redis] → post details (cache hit)
↓ (miss only)
[Cassandra] → post details
↓
[Celebrity posts fetched separately, merged]
↓
[Return unified ranked feed under 100ms]
[Images/Videos]
→ Stored in S3
→ Served via CDN globally
→ Never hits app servers
Everything Connects
| Concept | How It’s Used |
|---|---|
| Latency vs Throughput (L1) | Feed reads = latency sensitive; fan-out = throughput sensitive |
| Scalability (L2) | Fan-out workers scale horizontally |
| CAP Theorem (L3) | Feed = AP (eventual); post storage = CP |
| Consistency (L4) | Read-your-own-writes for your own posts |
| Load Balancers (L5) | Distribute across app servers and fan-out workers |
| Caching (L6) | Three-level cache; write-behind for like counts |
| Databases (L7) | Cassandra (posts, time series), PostgreSQL (follows), Redis (feeds) |
| Message Queues (L8) | Kafka decouples post creation from fan-out |
| CDN (L9) | Images and videos served from edge |
Key Takeaways
- Pull model collapses at scale. Push model collapses for celebrities. Hybrid model solves both.
- Redis sorted sets are purpose-built for feed storage — timestamp as score, always sorted.
- The celebrity threshold (10K followers) is a product + engineering decision, not a fixed number.
- Ranking must happen at read time, not write time — engagement signals don’t exist yet at fan-out.
- Stories need no new database — extend the post model with type and TTL.
- Cassandra native TTL is cleaner than a worker for time-based expiry.
Part of the system design series. Next: designing a ride-sharing system — location-based matching, real-time tracking, and dynamic pricing.