Design a news feed

In this series (18 parts)
  1. Design a URL shortener
  2. Design a key-value store
  3. Design a rate limiter
  4. Design a web crawler
  5. Design a notification system
  6. Design a news feed
  7. Design a chat application
  8. Design a video streaming platform
  9. Design a music streaming service
  10. Design a ride-sharing service
  11. Design a food delivery platform
  12. Design a hotel booking platform
  13. Design a search engine
  14. Design a distributed message queue
  15. Design a code deployment system
  16. Design a payments platform
  17. Design an ad click aggregation system
  18. Design a distributed cache

A news feed shows a reverse-chronological (or ranked) stream of posts from people you follow. The core challenge is not storing posts; it is delivering the right posts to hundreds of millions of users with low latency. Two competing strategies, fan-out on write and fan-out on read, define the design space. Most production systems use a hybrid.

1. Requirements

Functional

  • A user publishes a post (text, images, links).
  • A user views their news feed: an aggregated, ranked list of posts from people they follow.
  • Feed supports pagination (infinite scroll).
  • Posts appear in near real-time (under 5 seconds for most users).

Non-functional

  • Scale: 500 million users, 100 million DAU.
  • Read-heavy: average user refreshes feed 10 times per day. That is 1 billion feed reads per day, roughly 12,000 QPS with peaks at 30,000 QPS.
  • Writes: 5 million new posts per day, roughly 60 writes per second on average.
  • Availability: 99.99% for reads. Feed generation can tolerate slightly lower availability.
  • Latency: p99 feed read under 200 ms.

2. Capacity estimation

| Metric | Value |
| --- | --- |
| DAU | 100 million |
| Feed reads per day | 1 billion |
| Feed read QPS (avg / peak) | 12,000 / 30,000 |
| New posts per day | 5 million |
| Post write QPS (avg) | ~60 |
| Avg post size (metadata + text) | 1 KB |
| Post storage per day | 5 GB |
| Feed cache entry (200 post IDs × 8 bytes) | 1.6 KB per user |
| Feed cache total (100M active users) | ~160 GB |

The feed cache fits comfortably in a distributed caching cluster. Post storage grows linearly and is sharded by post ID.
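The estimates above are straightforward back-of-envelope arithmetic and can be reproduced in a few lines (all inputs are the assumed figures from the requirements):

```python
SECONDS_PER_DAY = 86_400

dau = 100_000_000
reads_per_user_per_day = 10
feed_reads_per_day = dau * reads_per_user_per_day            # 1 billion
read_qps_avg = feed_reads_per_day / SECONDS_PER_DAY          # ~11,600, rounded to 12,000

posts_per_day = 5_000_000
write_qps_avg = posts_per_day / SECONDS_PER_DAY              # ~58, rounded to 60

avg_post_bytes = 1_000                                       # 1 KB metadata + text
post_storage_per_day_gb = posts_per_day * avg_post_bytes / 1e9   # 5 GB/day

feed_entry_bytes = 200 * 8                                   # 200 post IDs × 8 bytes = 1.6 KB
feed_cache_total_gb = dau * feed_entry_bytes / 1e9           # 160 GB across the cluster
```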

3. High-level architecture

graph LR
  Client["Client (Mobile/Web)"]
  LB["Load Balancer"]
  PostSvc["Post Service"]
  FanoutSvc["Fan-out Service"]
  FeedSvc["Feed Service"]
  PostDB["Post DB (sharded)"]
  FeedCache["Feed Cache (Redis)"]
  MQ["Message Queue"]
  RankSvc["Ranking Service"]

  Client -->|publish| LB
  Client -->|read feed| LB
  LB --> PostSvc
  LB --> FeedSvc
  PostSvc -->|write| PostDB
  PostSvc -->|event| MQ
  MQ --> FanoutSvc
  FanoutSvc -->|push post ID| FeedCache
  FeedSvc -->|read| FeedCache
  FeedSvc -->|hydrate| PostDB
  FeedSvc --> RankSvc

High-level architecture of a news feed system. Posts flow through a message queue to fan-out, and feed reads hit cache first.

The write path and read path are decoupled. When a user publishes a post, the Post Service persists it and emits an event to a message queue. The Fan-out Service consumes events and pushes post IDs into each follower’s feed cache. When a user opens their feed, the Feed Service reads cached post IDs, hydrates them from the Post DB, and applies ranking.

4. Deep dive: fan-out strategies

The central design decision is how and when to distribute a post to followers.

Fan-out on write (push model)

When a user publishes a post, the system immediately writes the post ID into every follower’s feed cache. At read time, the feed is already assembled.

Pros: feed reads are fast (single cache lookup). Simple read path.

Cons: a user with 10 million followers triggers 10 million cache writes per post. Write amplification is enormous for celebrities.

Fan-out on read (pull model)

Nothing happens at write time beyond persisting the post. When a user opens their feed, the system fetches recent posts from everyone they follow, merges, and ranks on the fly.

Pros: no write amplification. Celebrity posts cost nothing extra at write time.

Cons: feed reads are slow. If you follow 500 people, the system must query 500 post lists and merge them. Read latency suffers.

Hybrid approach

Most production systems combine both. For normal users (fewer than, say, 10,000 followers), use fan-out on write. For celebrities, skip fan-out at write time. At read time, merge the precomputed feed with recent posts from followed celebrities.

This limits write amplification while keeping reads fast for the common case. The celebrity threshold is tunable.

The hybrid approach balances read and write latency. Read latency is slightly higher than pure push because celebrity posts are merged at read time, but write costs stay manageable.
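The branch between push and pull comes down to one comparison at publish time. A minimal sketch of that decision, using in-memory dicts to stand in for the feed cache and the celebrity post store (the threshold value and function names are illustrative):

```python
CELEBRITY_THRESHOLD = 10_000  # tunable; assumed value

feed_cache: dict = {}       # follower_id -> [(ts, post_id)], stands in for Redis
celebrity_posts: dict = {}  # author_id -> [(ts, post_id)], pulled at read time

def publish(author_id, post_id, ts, followers):
    """Hybrid fan-out: push for normal users, store-once for celebrities.

    Returns the number of cache writes performed, to make the
    write-amplification difference visible.
    """
    if len(followers) >= CELEBRITY_THRESHOLD:
        # Pull path: one write regardless of audience size; followers
        # merge this list into their feed at read time.
        celebrity_posts.setdefault(author_id, []).append((ts, post_id))
        return 1
    # Push path: one cache write per follower.
    for follower_id in followers:
        feed_cache.setdefault(follower_id, []).append((ts, post_id))
    return len(followers)
```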

5. Deep dive: post publish flow

sequenceDiagram
  participant U as User
  participant PS as Post Service
  participant DB as Post DB
  participant MQ as Message Queue
  participant FO as Fan-out Service
  participant FC as Feed Cache

  U->>PS: Create post
  PS->>DB: Persist post
  PS->>MQ: Publish event
  MQ->>FO: Deliver event
  FO->>FO: Fetch follower list
  FO->>FO: Filter celebrities
  FO->>FC: Push post ID to each follower feed
  FC-->>FO: ACK

Sequence diagram for the post publish flow. The fan-out service skips celebrity fan-out to avoid write amplification.

Key details:

  1. Follower list lookup: the Fan-out Service fetches the author’s follower list from a social graph service. This list is cached aggressively since it changes infrequently.
  2. Celebrity detection: if the author has more than a configured threshold of followers (e.g., 10,000), the fan-out service skips the push. The post is still persisted and will be pulled at read time.
  3. Batching: fan-out writes to the cache are batched. Writing 5,000 cache entries one at a time is slow; batching with pipeline commands in Redis reduces round trips.
  4. Ordering: each feed cache entry is a sorted set keyed by timestamp (or a monotonic ID). New post IDs are inserted with their creation time as the score.
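The batching point (detail 3) is worth sketching. The idea is to group follower writes into fixed-size chunks so each chunk costs one round trip, as a redis-py `pipeline()` of `ZADD` calls would. Here `cache_write_batch` is a hypothetical callback standing in for one pipelined round trip:

```python
def chunked(seq, size):
    """Yield consecutive slices of seq of at most `size` elements."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

def fan_out(post_id, ts, followers, cache_write_batch, batch_size=500):
    """Push (ts, post_id) into each follower's feed in batches.

    cache_write_batch(entries) represents one pipelined round trip to the
    cache, e.g. a Redis pipeline of ZADD commands with ts as the score.
    Returns the number of round trips, which is what batching minimizes.
    """
    round_trips = 0
    for batch in chunked(followers, batch_size):
        cache_write_batch([(follower_id, ts, post_id) for follower_id in batch])
        round_trips += 1
    return round_trips
```

With 5,000 followers and a batch size of 500, this is 10 round trips instead of 5,000.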

6. Deep dive: feed read flow

sequenceDiagram
  participant U as User
  participant FS as Feed Service
  participant FC as Feed Cache
  participant DB as Post DB
  participant RS as Ranking Service

  U->>FS: Get feed (cursor, page size)
  FS->>FC: Fetch post IDs (top N by score)
  FC-->>FS: Post ID list
  FS->>DB: Batch fetch posts by ID
  DB-->>FS: Post objects
  FS->>FS: Merge celebrity posts
  FS->>RS: Rank merged posts
  RS-->>FS: Ranked list
  FS-->>U: Feed page

Sequence diagram for reading the feed. Cached post IDs are hydrated from the database, merged with celebrity posts, and ranked.

The read path has several optimizations:

  • Cursor-based pagination: the client sends the last seen post ID (or timestamp) as a cursor. The cache returns the next page of IDs after that cursor. This avoids offset-based pagination problems.
  • Hydration batching: post IDs are fetched from the cache, then hydrated in a single batch query to the Post DB. This keeps round trips constant regardless of page size.
  • Celebrity merge: for each celebrity the user follows, the Feed Service fetches their recent posts (from a per-user cache or the Post DB) and merges them into the feed before ranking.
  • Ranking: a lightweight model scores posts based on recency, engagement signals (likes, comments), and user affinity. The Ranking Service returns a reordered list. For many systems, a simple weighted formula works well enough.
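The cursor walk, celebrity merge, and weighted ranking can be sketched together. This is an in-memory model, not the service code: feeds are lists of `(timestamp, post_id)` tuples, and the weights in `rank` are arbitrary illustrative values:

```python
import heapq
import math

def read_page(precomputed, celebrity_lists, cursor_ts, page_size):
    """Merge the precomputed feed with each followed celebrity's recent
    posts, newest first, and return the next page strictly older than
    cursor_ts (the client's pagination cursor)."""
    streams = [sorted(s, reverse=True) for s in [precomputed, *celebrity_lists]]
    page = []
    for ts, post_id in heapq.merge(*streams, reverse=True):
        if ts >= cursor_ts:
            continue  # already seen by the client
        page.append((ts, post_id))
        if len(page) == page_size:
            break
    return page

def rank(posts, now, w_recency=0.7, w_engagement=0.3):
    """posts: (ts, post_id, like_count) tuples.

    A simple weighted formula (recency decay + log-damped engagement),
    standing in for the Ranking Service before any ML is involved.
    """
    def score(post):
        age_hours = (now - post[0]) / 3600
        return w_recency / (1 + age_hours) + w_engagement * math.log1p(post[2])
    return sorted(posts, key=score, reverse=True)
```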

7. Data storage and sharding

Posts are stored in a relational or wide-column database, sharded by post ID using consistent hashing. Because post IDs are generated with a random component, sharding by post ID spreads load evenly across shards.

The social graph (who follows whom) is stored separately, often in a graph database or a sharded key-value store keyed by user ID. Follower lists are read-heavy and cached. A typical schema stores edges as (follower_id, followee_id) pairs. To fetch “all followers of user X,” you query on followee_id. To fetch “all users X follows,” you query on follower_id. Both directions need fast lookups, so the table is often double-written or uses secondary indexes.
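The double-write pattern for the edge table can be modeled with two maps, one per query direction, kept in sync on every follow/unfollow (the map names here are illustrative):

```python
from collections import defaultdict

# Double-written edges: each direction gets its own O(1) lookup,
# mirroring a table indexed on both follower_id and followee_id.
following = defaultdict(set)  # user_id -> set of users they follow
followers = defaultdict(set)  # user_id -> set of users following them

def follow(follower_id, followee_id):
    following[follower_id].add(followee_id)
    followers[followee_id].add(follower_id)

def unfollow(follower_id, followee_id):
    following[follower_id].discard(followee_id)
    followers[followee_id].discard(follower_id)
```

The cost of the double write is paid once per follow event; both "who do I follow" (feed read path) and "who follows X" (fan-out path) stay cheap.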

Feed caches use Redis sorted sets. Each user’s feed is a sorted set where the score is the post creation timestamp. The cache is keyed by user ID. With 100 million active users and 1.6 KB per entry, the total cache size is around 160 GB, easily handled by a Redis cluster.

A few storage considerations worth noting:

  • Cache eviction: feed caches store only the most recent 200-500 post IDs. Older entries are evicted. If a user scrolls far back, the system falls back to a database query.
  • Cache warming: when a new user signs up or a cache entry expires, the system must rebuild the feed from scratch. This is a cold-start problem. Pre-warming caches for active users during off-peak hours helps.
  • Write-ahead log: the message queue acts as a durable write-ahead log for fan-out. If the Fan-out Service crashes mid-way through distributing a post, it replays from the queue offset. This guarantees at-least-once delivery.
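The eviction policy in the first bullet amounts to trimming each feed to a fixed length after every insert, which is what `ZREMRANGEBYRANK` does on a Redis sorted set. A pure-Python sketch (the 300-entry cap is an assumed value inside the 200-500 range above):

```python
MAX_FEED_ENTRIES = 300  # assumed cap within the 200-500 range

def trim_feed(feed):
    """feed: list of (ts, post_id) sorted ascending by timestamp.

    Drops the oldest entries beyond the cap, emulating
    ZREMRANGEBYRANK 0 -(MAX_FEED_ENTRIES + 1) on a sorted set.
    Mutates and returns the list.
    """
    if len(feed) > MAX_FEED_ENTRIES:
        del feed[:len(feed) - MAX_FEED_ENTRIES]
    return feed
```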

8. Trade-offs and alternatives

| Decision | Option A | Option B | Recommendation |
| --- | --- | --- | --- |
| Fan-out strategy | Push (write) | Pull (read) | Hybrid with celebrity threshold |
| Feed cache | Redis sorted sets | Cassandra timelines | Redis for latency; Cassandra if durability matters more |
| Ranking | Chronological | ML-ranked | Start chronological, add lightweight ranking later |
| Pagination | Offset-based | Cursor-based | Cursor-based to avoid duplicates on active feeds |
| Post storage | SQL (PostgreSQL) | NoSQL (Cassandra) | Either works; Cassandra scales horizontally with less effort |

The hybrid fan-out strategy is the dominant pattern because pure push collapses under celebrity load and pure pull is too slow for reads.

9. What real systems actually do

Twitter (X) pioneered the hybrid fan-out model. Tweets from users below a follower-count threshold are fanned out on write; celebrity tweets are merged at read time. Twitter’s early architecture was pure fan-out on write, and Lady Gaga’s tweets famously caused latency spikes.

Facebook uses a ranked feed with a complex ML pipeline. The initial candidate generation uses a fan-out on write approach, but the ranking layer reorders aggressively. Facebook’s feed is not chronological; the ranking model optimizes for engagement.

Instagram moved from chronological to ranked feeds. The underlying storage uses Cassandra for timeline data, and the fan-out service handles celebrity accounts differently.

All three systems rely heavily on caching, message queues for async fan-out, and separate ranking services. The hybrid fan-out pattern is standard.

10. What comes next

This design covers the core feed pipeline. In production, you would also need:

  • Notification service: alert users about high-priority posts (mentions, close friends).
  • Content moderation: filter posts before they enter the fan-out pipeline.
  • Analytics pipeline: track impressions, clicks, and engagement for ranking model training.
  • Media storage: images and videos need a separate CDN-backed storage layer. Feed entries only store references.
  • Rate limiting: protect the write path from spam and abuse.

Each of these is a system design problem on its own. The feed architecture provides the backbone, and these services plug into the event pipeline through the message queue.

Key takeaways

  1. Fan-out strategy defines the architecture. Pure push works for small-scale systems. At scale, the hybrid approach is necessary to handle celebrity accounts without exploding write costs.
  2. Decouple write and read paths. Async fan-out through a message queue keeps publish latency low and lets you scale fan-out workers independently.
  3. Cache is the critical layer. Feed reads must be fast. A well-sized Redis cluster with sorted sets gives you sub-10ms lookups for precomputed feeds.
  4. Ranking is incremental. Start with reverse-chronological ordering. Layer in lightweight ranking (recency + engagement) before investing in full ML pipelines.
  5. Design for the celebrity edge case from day one. It is not an optimization; it is a correctness requirement at scale. A single celebrity post should not degrade the system for millions of users.