Estimations and back-of-envelope calculations
In this series (20 parts)
- What is system design and why it matters
- Estimations and back-of-envelope calculations
- Scalability: vertical vs horizontal scaling
- CAP theorem and distributed system tradeoffs
- Consistency models
- Load balancing
- Caching: strategies and patterns
- Content Delivery Networks
- Databases: SQL vs NoSQL and when to use each
- Database replication
- Database sharding and partitioning
- Consistent hashing
- Message queues and event streaming
- API design: REST, GraphQL, gRPC
- Rate limiting and throttling
- Proxies: forward and reverse
- Networking concepts for system design
- Reliability patterns: timeouts, retries, circuit breakers
- Observability: logging, metrics, tracing
- Security in system design
You cannot design a system without sizing it first. Before picking databases, choosing replication strategies, or drawing architecture diagrams, you need to answer basic questions: How much data will we store? How many requests per second will hit the service? How much bandwidth do we need? These numbers drive every downstream decision. Get them wrong by 10x and you pick the wrong architecture entirely.
This article gives you a repeatable framework for back-of-envelope estimation. We will cover the mental math tools, latency intuitions, and worked examples you need to walk into any system design discussion with confidence.
Why estimation matters
Imagine you are designing a URL shortener. Someone asks: “Should we use a relational database or a key-value store?” Without knowing the expected read/write ratio, request volume, and data size, you are guessing. Estimation turns guesswork into reasoning.
The goal is never precision. You want to land within the right order of magnitude. Knowing that your storage need is roughly 10 TB (not 100 GB, not 1 PB) is enough to eliminate entire categories of solutions and focus on what actually fits.
Powers of 2 cheat sheet
Every estimation starts with knowing your units. Memorize this table and you will never fumble a conversion mid-interview or mid-design session.
| Power | Exact value | Approximate | Unit |
|---|---|---|---|
| 2^10 | 1,024 | ~1 thousand | 1 KB |
| 2^20 | 1,048,576 | ~1 million | 1 MB |
| 2^30 | 1,073,741,824 | ~1 billion | 1 GB |
| 2^40 | 1,099,511,627,776 | ~1 trillion | 1 TB |
| 2^50 | 1,125,899,906,842,624 | ~1 quadrillion | 1 PB |
A few handy shortcuts that come up constantly:
- 1 million seconds is about 11.5 days
- 1 billion seconds is about 31.7 years
- There are roughly 86,400 seconds in a day, and about 2.5 million seconds in a month
- For quick math, treat a day as ~100,000 seconds (about 16% high, close enough for order-of-magnitude work)
These approximations let you convert between “requests per day” and “requests per second” without a calculator.
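As a quick sketch, these shortcuts can be sanity-checked in a few lines of Python. Everything below uses only the figures from this section; nothing external is assumed:

```python
SECONDS_PER_DAY = 86_400   # exact: 24 * 60 * 60
QUICK_DAY = 100_000        # the ~10^5 quick-math shortcut

def qps_from_daily(requests_per_day: float, quick: bool = False) -> float:
    """Convert a requests-per-day figure into queries per second."""
    return requests_per_day / (QUICK_DAY if quick else SECONDS_PER_DAY)

# 1 billion requests/day is roughly 12,000 QPS
print(round(qps_from_daily(1e9)))        # 11574 with exact seconds
print(round(qps_from_daily(1e9, True)))  # 10000 with quick math

# How far off is the quick-math shortcut?
error = (QUICK_DAY - SECONDS_PER_DAY) / SECONDS_PER_DAY
print(f"shortcut overestimates a day by {error:.0%}")
```

Both answers land in the same order of magnitude, which is all an estimate needs.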
Latency numbers every engineer should know
Not all operations are equal. Reading 1 MB from memory takes a fundamentally different amount of time than reading 1 MB from disk or sending it across the network. The table below captures the numbers you should internalize. These are approximate and reflect modern hardware (2024 era), but the relative ordering has been stable for decades.
| Operation | Latency |
|---|---|
| L1 cache reference | 0.5 ns |
| Branch mispredict | 5 ns |
| L2 cache reference | 7 ns |
| Mutex lock/unlock | 25 ns |
| Main memory reference | 100 ns |
| Compress 1 KB with Snappy | 3,000 ns (3 µs) |
| Send 1 KB over 1 Gbps network | 10,000 ns (10 µs) |
| Read 4 KB randomly from SSD | 150,000 ns (150 µs) |
| Read 1 MB sequentially from memory | 250,000 ns (250 µs) |
| Round trip within same datacenter | 500,000 ns (500 µs) |
| Read 1 MB sequentially from SSD | 1,000,000 ns (1 ms) |
| HDD disk seek | 10,000,000 ns (10 ms) |
| Read 1 MB sequentially from HDD | 20,000,000 ns (20 ms) |
| Round trip: CA → Netherlands → CA | 150,000,000 ns (150 ms) |
The key takeaway: memory is roughly 4x faster than SSD for sequential reads, SSD is roughly 20x faster than HDD, and network round trips dominate everything once you leave the machine.
Latency comparison across storage and network operations. Note the logarithmic scale: each bar represents an order-of-magnitude jump.
These numbers have a direct design implication. If your architecture requires two cross-continent round trips per user request, you have already burned 300 ms before any computation. That is why CDNs exist, why caches sit close to the application, and why database selection matters so much.
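A latency budget is just addition over the table above. This small sketch (using the table's approximate figures, not measured values) shows how quickly round trips eat a request budget:

```python
# Approximate latencies from the table above, in nanoseconds
LATENCY_NS = {
    "memory_ref": 100,
    "ssd_random_read_4kb": 150_000,
    "datacenter_rtt": 500_000,
    "cross_continent_rtt": 150_000_000,
}

def budget_ms(*ops: str) -> float:
    """Total latency in milliseconds for a sequence of operations."""
    return sum(LATENCY_NS[op] for op in ops) / 1_000_000

# Two cross-continent round trips burn 300 ms before any computation:
print(budget_ms("cross_continent_rtt", "cross_continent_rtt"))  # 300.0

# The same two hops inside one datacenter cost 1 ms:
print(budget_ms("datacenter_rtt", "datacenter_rtt"))            # 1.0
```

That 300x gap between the two figures is the entire argument for CDNs and regional deployments in one number.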
The estimation framework
Every back-of-envelope calculation follows the same basic structure. You start with user-facing assumptions, convert them into system-level metrics, and then compute resource requirements.
```mermaid
flowchart TD
    A[User assumptions] --> B[Traffic estimation]
    B --> C[QPS calculation]
    B --> D[Storage estimation]
    B --> E[Bandwidth estimation]
    C --> F[Server sizing]
    D --> G[Database choice]
    E --> H[Network provisioning]
    F --> I[Architecture decisions]
    G --> I
    H --> I
    style A fill:#3b82f6,color:#fff
    style I fill:#10b981,color:#fff
```
The estimation pipeline: user assumptions flow into traffic, storage, and bandwidth numbers, which drive architecture decisions.
Let us formalize the three core calculations.
QPS (queries per second)
QPS tells you how many requests your system must handle. The formula is simple:
Daily active users (DAU) x average actions per user / seconds in a day = QPS
For peak load, multiply by a peak factor (typically 2x to 5x the average). Systems must be provisioned for peak, not average, or they fall over during traffic surges.
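The QPS formula translates directly into code. This is a minimal sketch; the peak factor of 3 is one point in the 2x–5x range suggested above, not a universal constant:

```python
def qps(dau: float, actions_per_user_per_day: float,
        peak_factor: float = 3.0) -> tuple[float, float]:
    """Average and peak QPS from daily-active-user assumptions.

    peak_factor=3 sits mid-range of the 2x-5x rule of thumb;
    tune it for your own traffic pattern.
    """
    average = dau * actions_per_user_per_day / 86_400
    return average, average * peak_factor

avg, peak = qps(dau=200e6, actions_per_user_per_day=2)
print(f"average: {avg:,.0f} QPS, peak: {peak:,.0f} QPS")
# average: 4,630 QPS, peak: 13,889 QPS
```

Provision for the second number, not the first.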
Storage estimation
Storage requires you to estimate the size of each record and the total number of records over time:
Record size x records per day x retention period (days) = total storage
Always account for metadata, indexes, and replication overhead. A good rule of thumb: multiply your raw data estimate by 3x to account for indexes, replicas, and operational headroom.
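The same formula as a function, with the 3x overhead multiplier made explicit. Units are decimal (1 TB = 10^12 bytes), matching the loose units used throughout this article:

```python
def storage_bytes(record_bytes: float, records_per_day: float,
                  retention_days: float, overhead: float = 3.0) -> float:
    """Total storage: raw retained data times an overhead multiplier.

    overhead=3 covers indexes, replicas, and operational headroom,
    per the rule of thumb above.
    """
    return record_bytes * records_per_day * retention_days * overhead

TB = 1e12
raw = storage_bytes(280, 400e6, 365 * 5, overhead=1.0)
print(f"raw:          {raw / TB:.0f} TB")      # 204 TB
print(f"with 3x:      {3 * raw / TB:.0f} TB")  # 613 TB
```

Note how the overhead multiplier turns a comfortable number into a very different provisioning problem.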
Bandwidth estimation
Bandwidth is the data flowing in and out of your system per second:
QPS x average request/response size = bandwidth
You need to calculate both ingress (data coming in) and egress (data going out) separately because they often differ dramatically. A system where users upload small text posts but read image-heavy feeds will have much higher egress than ingress.
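A sketch of the ingress/egress split, using the assumptions from the worked example later in this article (text-heavy writes, media-heavy reads):

```python
def bandwidth(qps: float, avg_payload_bytes: float) -> float:
    """Bandwidth in bytes/second = QPS x average payload size."""
    return qps * avg_payload_bytes

KB, MB, GB = 1e3, 1e6, 1e9

# Ingress: small text writes plus a 10% media-upload fraction
ingress = (bandwidth(5_000, 280)
           + bandwidth(5_000 * 0.10, 500 * KB))

# Egress: each read returns 20 tweets and ~3 media items
egress = bandwidth(230_000, 20 * 280 + 3 * 500 * KB)

print(f"ingress: {ingress / MB:.0f} MB/s")  # ~251 MB/s
print(f"egress:  {egress / GB:.0f} GB/s")   # ~346 GB/s
```

Three orders of magnitude separate the two directions, which is why they must be calculated separately.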
Worked example: a Twitter-scale system
Let us put everything together with a concrete example. We will estimate the resources needed for a simplified Twitter-like service. This is the kind of problem that shows up in system design discussions, and walking through it methodically demonstrates the estimation process.
Step 1: State your assumptions
Start by writing down your assumptions explicitly. This is critical. In an interview, it shows your interviewer exactly what you are working with. In a real design, it gives your team a document to challenge.
| Assumption | Value |
|---|---|
| Monthly active users (MAU) | 400 million |
| Daily active users (DAU) | 200 million (50% of MAU) |
| Average tweets per user per day | 2 |
| Average tweet size (text) | 280 bytes |
| Tweets with a media attachment | 10% (average 500 KB each) |
| Average reads per user per day | 100 (timeline loads, searches) |
| Read:write ratio | 50:1 |
| Data retention | 5 years |
Step 2: QPS calculation
Write QPS (new tweets):
200 million users x 2 tweets / 86,400 seconds = ~4,600 tweets/second
Rounding to ~5,000 write QPS for clean math.
Peak write QPS: 5,000 x 3 (peak factor) = ~15,000 QPS
Read QPS (timeline and search):
200 million users x 100 reads / 86,400 seconds = ~230,000 read QPS
Peak read QPS: 230,000 x 3 = ~700,000 QPS
That read QPS is substantial. It tells us immediately that caching is not optional. You cannot serve 700,000 database queries per second without a caching layer. This single number pushes us toward architectures with aggressive caching and possibly pre-computed timelines.
Step 3: Storage estimation
Text storage per day:
200 million users x 2 tweets x 280 bytes = 112 GB/day
Media storage per day:
200 million x 2 x 0.10 (10% with media) x 500 KB = 20 TB/day
Storage over 5 years:
- Text: 112 GB/day x 365 x 5 = ~204 TB
- Media: 20 TB/day x 365 x 5 = ~36.5 PB
With the 3x multiplier for indexes, replicas, and overhead:
- Text: ~612 TB
- Media: ~110 PB
Media dominates storage by two orders of magnitude. This tells you that text and media must live in fundamentally different storage systems. Text goes in a database. Media goes in object storage (like S3) with a CDN in front.
Step 4: Bandwidth estimation
Ingress (writes):
Text: 5,000 QPS x 280 bytes = 1.4 MB/s
Media: 5,000 x 0.10 x 500 KB = 250 MB/s
Total ingress: ~250 MB/s
Egress (reads):
Assume each timeline load fetches 20 tweets with 3 media items on average:
230,000 QPS x (20 x 280 bytes + 3 x 500 KB) = ~350 GB/s
That egress number is enormous. It confirms that you need a CDN for media delivery. No origin server cluster can sustain 350 GB/s of outbound traffic. The CDN absorbs the vast majority of media reads, while the origin servers handle text and cache misses.
Step 5: Summarize and sanity check
| Metric | Value |
|---|---|
| Write QPS | ~5,000 (peak ~15,000) |
| Read QPS | ~230,000 (peak ~700,000) |
| Text storage (5 yr) | ~200 TB raw |
| Media storage (5 yr) | ~36 PB raw |
| Ingress bandwidth | ~250 MB/s |
| Egress bandwidth | ~350 GB/s (CDN-heavy) |
Always sanity check by comparing your numbers against known benchmarks. Twitter has reported handling around 6,000 tweets per second on average, which aligns well with our 5,000 estimate. That kind of validation gives you confidence that your assumptions are reasonable.
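The whole worked example is mechanical enough to script. A sketch like this, recomputing every headline number from the stated assumptions, is a useful habit: if an assumption changes, rerun it instead of redoing arithmetic by hand:

```python
# Recompute the worked-example figures from the stated assumptions
# and flag anything that drifts out of the expected range.
DAU = 200e6
SECONDS_PER_DAY = 86_400
RETENTION_DAYS = 365 * 5

write_qps = DAU * 2 / SECONDS_PER_DAY            # ~4,600
read_qps = DAU * 100 / SECONDS_PER_DAY           # ~230,000
text_per_day = DAU * 2 * 280                     # 112 GB/day
media_per_day = DAU * 2 * 0.10 * 500e3           # 20 TB/day

assert 4_000 < write_qps < 6_000
assert 200_000 < read_qps < 250_000
assert round(text_per_day / 1e9) == 112                      # GB/day
assert round(media_per_day / 1e12) == 20                     # TB/day
assert round(text_per_day * RETENTION_DAYS / 1e12) == 204    # TB, 5 yr
assert round(media_per_day * RETENTION_DAYS / 1e15, 1) == 36.5  # PB, 5 yr
print("all worked-example figures check out")
```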
Common estimation mistakes
Forgetting peak load. Average QPS is meaningless for capacity planning. Your system crashes at peak, not at average. Always multiply by a peak factor of 2x to 5x.
Ignoring metadata overhead. A “280-byte tweet” actually consumes much more when you add user ID, timestamp, geo data, indexes, and storage engine overhead. A realistic per-tweet storage cost is closer to 1 KB.
Assuming linear scaling. Doubling your servers does not double your throughput. Network overhead, coordination costs, and shared resources create diminishing returns. Leave headroom.
Mixing up megabits and megabytes. Network speeds are quoted in bits (Mbps, Gbps). Storage is quoted in bytes. 1 Gbps = 125 MB/s. This factor-of-8 difference has tripped up many engineers mid-calculation.
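The bits-to-bytes conversion is a one-liner worth internalizing (decimal units: 1 Gbps = 10^9 bits/s):

```python
def gbps_to_mb_per_sec(gbps: float) -> float:
    """Convert a link speed in gigabits/s to megabytes/s: divide by 8."""
    return gbps * 1000 / 8

print(gbps_to_mb_per_sec(1))   # 125.0 MB/s
print(gbps_to_mb_per_sec(10))  # 1250.0 MB/s
```

If a bandwidth estimate ever looks 8x too good, this conversion is the first thing to recheck.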
Quick reference: useful numbers
Keep these in your back pocket for fast estimation during design discussions:
| What | How many |
|---|---|
| Seconds in a day | ~86,400 (~10^5 for quick math) |
| Seconds in a month | ~2.5 million |
| Seconds in a year | ~31.5 million (~π x 10^7) |
| QPS if 1M requests/day | ~12 QPS |
| QPS if 100M requests/day | ~1,200 QPS |
| QPS if 1B requests/day | ~12,000 QPS |
| 1 KB x 1M records | 1 GB |
| 1 MB x 1M records | 1 TB |
| 1 MB x 1B records | 1 PB |
The “π x 10^7 seconds per year” approximation is surprisingly accurate and easy to remember. Use it freely.
Estimation as a communication tool
Back-of-envelope calculations are not just about getting the right number. They are a communication tool. When you walk through an estimation in a design discussion, you are showing your reasoning. You are making your assumptions explicit so others can challenge them. You are demonstrating that your architecture is grounded in reality, not vibes.
The best system designers use estimation to eliminate options quickly. If your storage estimate is 50 PB, you are not using a single MySQL instance. If your read QPS is 500,000, you need caching. If your cross-region latency budget is 200 ms, you need regional deployments. Each number narrows the design space and moves you closer to a solution that actually works.
Practice estimation regularly. Pick a product you use daily, make assumptions about its scale, and run the numbers. Compare your estimates against any public data you can find. Over time, you will develop intuition for what “reasonable” looks like at different scales, and that intuition is what separates engineers who design systems from engineers who just use them.
What comes next
With estimation skills in hand, you are ready to tackle the question that follows naturally: once you know your system needs to handle 700,000 QPS, how do you actually build something that can? That is the topic of Scalability, where we cover horizontal vs vertical scaling, load balancing, and the patterns that let systems grow from thousands to millions of users.