Estimations and back-of-envelope calculations
In this series (20 parts)
- What is system design and why it matters
- Estimations and back-of-envelope calculations
- Scalability: vertical vs horizontal scaling
- CAP theorem and distributed system tradeoffs
- Consistency models
- Load balancing
- Caching: strategies and patterns
- Content Delivery Networks
- Databases: SQL vs NoSQL and when to use each
- Database replication
- Database sharding and partitioning
- Consistent hashing
- Message queues and event streaming
- API design: REST, GraphQL, gRPC
- Rate limiting and throttling
- Proxies: forward and reverse
- Networking concepts for system design
- Reliability patterns: timeouts, retries, circuit breakers
- Observability: logging, metrics, tracing
- Security in system design
You cannot design a system without sizing it first. Before picking databases, choosing replication strategies, or drawing architecture diagrams, you need to answer basic questions: How much data will we store? How many requests per second will hit the service? How much bandwidth do we need? These numbers drive every downstream decision. Get them wrong by 10x and you pick the wrong architecture entirely.
This article gives you a repeatable framework for back-of-envelope estimation. We will cover the mental math tools, latency intuitions, and worked examples you need to walk into any system design discussion with confidence.
Why estimation matters
Imagine you are designing a URL shortener. Someone asks: “Should we use a relational database or a key-value store?” Without knowing the expected read/write ratio, request volume, and data size, you are guessing. Estimation turns guesswork into reasoning.
The goal is never precision. You want to land within the right order of magnitude. Knowing that your storage need is roughly 10 TB (not 100 GB, not 1 PB) is enough to eliminate entire categories of solutions and focus on what actually fits.
Powers of 2 cheat sheet
Every estimation starts with knowing your units. Memorize this table and you will never fumble a conversion mid-interview or mid-design session.
| Power | Exact value | Approximate | Unit |
|---|---|---|---|
| 2^10 | 1,024 | ~1 thousand | 1 KB |
| 2^20 | 1,048,576 | ~1 million | 1 MB |
| 2^30 | 1,073,741,824 | ~1 billion | 1 GB |
| 2^40 | 1,099,511,627,776 | ~1 trillion | 1 TB |
| 2^50 | 1,125,899,906,842,624 | ~1 quadrillion | 1 PB |
A few handy shortcuts that come up constantly:
- 1 million seconds is about 11.5 days
- 1 billion seconds is about 31.7 years
- There are roughly 86,400 seconds in a day, and about 2.5 million seconds in a month
- For quick math, treat a day as ~100,000 seconds (about 16% high, close enough for order-of-magnitude work)
These approximations let you convert between “requests per day” and “requests per second” without a calculator.
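As a quick sketch, these shortcuts can be sanity-checked in a few lines of Python. Everything below uses only the figures from this section; nothing external is assumed:

```python
SECONDS_PER_DAY = 86_400   # exact: 24 * 60 * 60
QUICK_DAY = 100_000        # the ~10^5 quick-math shortcut

def qps_from_daily(requests_per_day: float, quick: bool = False) -> float:
    """Convert a requests-per-day figure into queries per second."""
    return requests_per_day / (QUICK_DAY if quick else SECONDS_PER_DAY)

# 1 billion requests/day is roughly 12,000 QPS
print(round(qps_from_daily(1e9)))        # 11574 with exact seconds
print(round(qps_from_daily(1e9, True)))  # 10000 with quick math

# How far off is the quick-math shortcut?
error = (QUICK_DAY - SECONDS_PER_DAY) / SECONDS_PER_DAY
print(f"shortcut overestimates a day by {error:.0%}")
```

Both answers land in the same order of magnitude, which is all an estimate needs.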
Latency numbers every engineer should know
Not all operations are equal. Reading 1 MB from memory takes a fundamentally different amount of time than reading 1 MB from disk or sending it across the network. The table below captures the numbers you should internalize. These are approximate and reflect modern hardware (2024 era), but the relative ordering has been stable for decades.
| Operation | Latency |
|---|---|
| L1 cache reference | 0.5 ns |
| Branch mispredict | 5 ns |
| L2 cache reference | 7 ns |
| Mutex lock/unlock | 25 ns |
| Main memory reference | 100 ns |
| Compress 1 KB with Snappy | 3,000 ns (3 µs) |
| Send 1 KB over 1 Gbps network | 10,000 ns (10 µs) |
| Read 4 KB randomly from SSD | 150,000 ns (150 µs) |
| Read 1 MB sequentially from memory | 250,000 ns (250 µs) |
| Round trip within same datacenter | 500,000 ns (500 µs) |
| Read 1 MB sequentially from SSD | 1,000,000 ns (1 ms) |
| HDD disk seek | 10,000,000 ns (10 ms) |
| Read 1 MB sequentially from HDD | 20,000,000 ns (20 ms) |
| Round trip: CA → Netherlands → CA | 150,000,000 ns (150 ms) |
The key takeaway: memory is roughly 4x faster than SSD for sequential reads, SSD is roughly 20x faster than HDD, and network round trips dominate everything once you leave the machine.
Latency comparison across storage and network operations. Note the logarithmic scale: each bar represents an order-of-magnitude jump.
These numbers have a direct design implication. If your architecture requires two cross-continent round trips per user request, you have already burned 300 ms before any computation. That is why CDNs exist, why caches sit close to the application, and why database selection matters so much.
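A latency budget is just addition over the table above. This small sketch (using the table's approximate figures, not measured values) shows how quickly round trips eat a request budget:

```python
# Approximate latencies from the table above, in nanoseconds
LATENCY_NS = {
    "memory_ref": 100,
    "ssd_random_read_4kb": 150_000,
    "datacenter_rtt": 500_000,
    "cross_continent_rtt": 150_000_000,
}

def budget_ms(*ops: str) -> float:
    """Total latency in milliseconds for a sequence of operations."""
    return sum(LATENCY_NS[op] for op in ops) / 1_000_000

# Two cross-continent round trips burn 300 ms before any computation:
print(budget_ms("cross_continent_rtt", "cross_continent_rtt"))  # 300.0

# The same two hops inside one datacenter cost 1 ms:
print(budget_ms("datacenter_rtt", "datacenter_rtt"))            # 1.0
```

That 300x gap between the two figures is the entire argument for CDNs and regional deployments in one number.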
The estimation framework
Every back-of-envelope calculation follows the same basic structure. You start with user-facing assumptions, convert them into system-level metrics, and then compute resource requirements.
```mermaid
flowchart TD
    A[User assumptions] --> B[Traffic estimation]
    B --> C[QPS calculation]
    B --> D[Storage estimation]
    B --> E[Bandwidth estimation]
    C --> F[Server sizing]
    D --> G[Database choice]
    E --> H[Network provisioning]
    F --> I[Architecture decisions]
    G --> I
    H --> I
    style A fill:#3b82f6,color:#fff
    style I fill:#10b981,color:#fff
```
The estimation pipeline: user assumptions flow into traffic, storage, and bandwidth numbers, which drive architecture decisions.
Let us formalize the three core calculations.
QPS (queries per second)
QPS tells you how many requests your system must handle. The formula is simple:
Daily active users (DAU) x average actions per user / seconds in a day = QPS
For peak load, multiply by a peak factor (typically 2x to 5x the average). Systems must be provisioned for peak, not average, or they fall over during traffic surges.
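The QPS formula translates directly into code. This is a minimal sketch; the peak factor of 3 is one point in the 2x–5x range suggested above, not a universal constant:

```python
def qps(dau: float, actions_per_user_per_day: float,
        peak_factor: float = 3.0) -> tuple[float, float]:
    """Average and peak QPS from daily-active-user assumptions.

    peak_factor=3 sits mid-range of the 2x-5x rule of thumb;
    tune it for your own traffic pattern.
    """
    average = dau * actions_per_user_per_day / 86_400
    return average, average * peak_factor

avg, peak = qps(dau=200e6, actions_per_user_per_day=2)
print(f"average: {avg:,.0f} QPS, peak: {peak:,.0f} QPS")
# average: 4,630 QPS, peak: 13,889 QPS
```

Provision for the second number, not the first.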
Storage estimation
Storage requires you to estimate the size of each record and the total number of records over time:
Record size x records per day x retention period (days) = total storage
Always account for metadata, indexes, and replication overhead. A good rule of thumb: multiply your raw data estimate by 3x to account for indexes, replicas, and operational headroom.
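The same formula as a function, with the 3x overhead multiplier made explicit. Units are decimal (1 TB = 10^12 bytes), matching the loose units used throughout this article:

```python
def storage_bytes(record_bytes: float, records_per_day: float,
                  retention_days: float, overhead: float = 3.0) -> float:
    """Total storage: raw retained data times an overhead multiplier.

    overhead=3 covers indexes, replicas, and operational headroom,
    per the rule of thumb above.
    """
    return record_bytes * records_per_day * retention_days * overhead

TB = 1e12
raw = storage_bytes(280, 400e6, 365 * 5, overhead=1.0)
print(f"raw:          {raw / TB:.0f} TB")      # 204 TB
print(f"with 3x:      {3 * raw / TB:.0f} TB")  # 613 TB
```

Note how the overhead multiplier turns a comfortable number into a very different provisioning problem.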
Bandwidth estimation
Bandwidth is the data flowing in and out of your system per second:
QPS x average request/response size = bandwidth
You need to calculate both ingress (data coming in) and egress (data going out) separately because they often differ dramatically. A system where users upload small text posts but read image-heavy feeds will have much higher egress than ingress.
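A sketch of the ingress/egress split, using the assumptions from the worked example later in this article (text-heavy writes, media-heavy reads):

```python
def bandwidth(qps: float, avg_payload_bytes: float) -> float:
    """Bandwidth in bytes/second = QPS x average payload size."""
    return qps * avg_payload_bytes

KB, MB, GB = 1e3, 1e6, 1e9

# Ingress: small text writes plus a 10% media-upload fraction
ingress = (bandwidth(5_000, 280)
           + bandwidth(5_000 * 0.10, 500 * KB))

# Egress: each read returns 20 tweets and ~3 media items
egress = bandwidth(230_000, 20 * 280 + 3 * 500 * KB)

print(f"ingress: {ingress / MB:.0f} MB/s")  # ~251 MB/s
print(f"egress:  {egress / GB:.0f} GB/s")   # ~346 GB/s
```

Three orders of magnitude separate the two directions, which is why they must be calculated separately.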
Worked example: a Twitter-scale system
Let us put everything together with a concrete example. We will estimate the resources needed for a simplified Twitter-like service. This is the kind of problem that shows up in system design discussions, and walking through it methodically demonstrates the estimation process.
Step 1: State your assumptions
Start by writing down your assumptions explicitly. This is critical. In an interview, it shows your interviewer exactly what you are working with. In a real design, it gives your team a document to challenge.
| Assumption | Value |
|---|---|
| Monthly active users (MAU) | 400 million |
| Daily active users (DAU) | 200 million (50% of MAU) |
| Average tweets per user per day | 2 |
| Average tweet size (text) | 280 bytes |
| Tweets with a media attachment | 10% (average 500 KB each) |
| Average reads per user per day | 100 (timeline loads, searches) |
| Read:write ratio | 50:1 |
| Data retention | 5 years |
Step 2: QPS calculation
Write QPS (new tweets):
200 million users x 2 tweets / 86,400 seconds = ~4,600 tweets/second
Rounding to ~5,000 write QPS for clean math.
Peak write QPS: 5,000 x 3 (peak factor) = ~15,000 QPS
Read QPS (timeline and search):
200 million users x 100 reads / 86,400 seconds = ~230,000 read QPS
Peak read QPS: 230,000 x 3 = ~700,000 QPS
That read QPS is substantial. It tells us immediately that caching is not optional. You cannot serve 700,000 database queries per second without a caching layer. This single number pushes us toward architectures with aggressive caching and possibly pre-computed timelines.
Step 3: Storage estimation
Text storage per day:
200 million users x 2 tweets x 280 bytes = 112 GB/day
Media storage per day:
200 million x 2 x 0.10 (10% with media) x 500 KB = 20 TB/day
Storage over 5 years:
- Text: 112 GB/day x 365 x 5 = ~204 TB
- Media: 20 TB/day x 365 x 5 = ~36.5 PB
With the 3x multiplier for indexes, replicas, and overhead:
- Text: ~612 TB
- Media: ~110 PB
Media dominates storage by two orders of magnitude. This tells you that text and media must live in fundamentally different storage systems. Text goes in a database. Media goes in object storage (like S3) with a CDN in front.
Step 4: Bandwidth estimation
Ingress (writes):
Text: 5,000 QPS x 280 bytes = 1.4 MB/s
Media: 5,000 x 0.10 x 500 KB = 250 MB/s
Total ingress: ~250 MB/s
Egress (reads):
Assume each timeline load fetches 20 tweets with 3 media items on average:
230,000 QPS x (20 x 280 bytes + 3 x 500 KB) = ~350 GB/s
That egress number is enormous. It confirms that you need a CDN for media delivery. No origin server cluster can sustain 350 GB/s of outbound traffic. The CDN absorbs the vast majority of media reads, while the origin servers handle text and cache misses.
Step 5: Summarize and sanity check
| Metric | Value |
|---|---|
| Write QPS | ~5,000 (peak ~15,000) |
| Read QPS | ~230,000 (peak ~700,000) |
| Text storage (5 yr) | ~200 TB raw |
| Media storage (5 yr) | ~36 PB raw |
| Ingress bandwidth | ~250 MB/s |
| Egress bandwidth | ~350 GB/s (CDN-heavy) |
Always sanity check by comparing your numbers against known benchmarks. Twitter has reported handling around 6,000 tweets per second on average, which aligns well with our 5,000 estimate. That kind of validation gives you confidence that your assumptions are reasonable.
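The whole worked example is mechanical enough to script. A sketch like this, recomputing every headline number from the stated assumptions, is a useful habit: if an assumption changes, rerun it instead of redoing arithmetic by hand:

```python
# Recompute the worked-example figures from the stated assumptions
# and flag anything that drifts out of the expected range.
DAU = 200e6
SECONDS_PER_DAY = 86_400
RETENTION_DAYS = 365 * 5

write_qps = DAU * 2 / SECONDS_PER_DAY            # ~4,600
read_qps = DAU * 100 / SECONDS_PER_DAY           # ~230,000
text_per_day = DAU * 2 * 280                     # 112 GB/day
media_per_day = DAU * 2 * 0.10 * 500e3           # 20 TB/day

assert 4_000 < write_qps < 6_000
assert 200_000 < read_qps < 250_000
assert round(text_per_day / 1e9) == 112                      # GB/day
assert round(media_per_day / 1e12) == 20                     # TB/day
assert round(text_per_day * RETENTION_DAYS / 1e12) == 204    # TB, 5 yr
assert round(media_per_day * RETENTION_DAYS / 1e15, 1) == 36.5  # PB, 5 yr
print("all worked-example figures check out")
```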
Common estimation mistakes
Forgetting peak load. Average QPS is meaningless for capacity planning. Your system crashes at peak, not at average. Always multiply by a peak factor of 2x to 5x.
Ignoring metadata overhead. A “280-byte tweet” actually consumes much more when you add user ID, timestamp, geo data, indexes, and storage engine overhead. A realistic per-tweet storage cost is closer to 1 KB.
Assuming linear scaling. Doubling your servers does not double your throughput. Network overhead, coordination costs, and shared resources create diminishing returns. Leave headroom.
Mixing up megabits and megabytes. Network speeds are quoted in bits (Mbps, Gbps). Storage is quoted in bytes. 1 Gbps = 125 MB/s. This factor-of-8 difference has tripped up many engineers mid-calculation.
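The bits-to-bytes conversion is a one-liner worth internalizing (decimal units: 1 Gbps = 10^9 bits/s):

```python
def gbps_to_mb_per_sec(gbps: float) -> float:
    """Convert a link speed in gigabits/s to megabytes/s: divide by 8."""
    return gbps * 1000 / 8

print(gbps_to_mb_per_sec(1))   # 125.0 MB/s
print(gbps_to_mb_per_sec(10))  # 1250.0 MB/s
```

If a bandwidth estimate ever looks 8x too good, this conversion is the first thing to recheck.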
Quick reference: useful numbers
Keep these in your back pocket for fast estimation during design discussions:
| What | How many |
|---|---|
| Seconds in a day | ~86,400 (~10^5 for quick math) |
| Seconds in a month | ~2.5 million |
| Seconds in a year | ~31.5 million (~π x 10^7) |
| QPS if 1M requests/day | ~12 QPS |
| QPS if 100M requests/day | ~1,200 QPS |
| QPS if 1B requests/day | ~12,000 QPS |
| 1 KB x 1M records | 1 GB |
| 1 MB x 1M records | 1 TB |
| 1 MB x 1B records | 1 PB |
The “π x 10^7 seconds per year” approximation is surprisingly accurate and easy to remember. Use it freely.
Estimation as a communication tool
Back-of-envelope calculations are not just about getting the right number. They are a communication tool. When you walk through an estimation in a design discussion, you are showing your reasoning. You are making your assumptions explicit so others can challenge them. You are demonstrating that your architecture is grounded in reality, not vibes.
The best system designers use estimation to eliminate options quickly. If your storage estimate is 50 PB, you are not using a single MySQL instance. If your read QPS is 500,000, you need caching. If your cross-region latency budget is 200 ms, you need regional deployments. Each number narrows the design space and moves you closer to a solution that actually works.
Practice estimation regularly. Pick a product you use daily, make assumptions about its scale, and run the numbers. Compare your estimates against any public data you can find. Over time, you will develop intuition for what “reasonable” looks like at different scales, and that intuition is what separates engineers who design systems from engineers who just use them.
What comes next
With estimation skills in hand, you are ready to tackle the question that follows naturally: once you know your system needs to handle 700,000 QPS, how do you actually build something that can? That is the topic of Scalability, where we cover horizontal vs vertical scaling, load balancing, and the patterns that let systems grow from thousands to millions of users.