Design a URL shortener
In this series (18 parts)
- Design a URL shortener
- Design a key-value store
- Design a rate limiter
- Design a web crawler
- Design a notification system
- Design a news feed
- Design a chat application
- Design a video streaming platform
- Design a music streaming service
- Design a ride-sharing service
- Design a food delivery platform
- Design a hotel booking platform
- Design a search engine
- Design a distributed message queue
- Design a code deployment system
- Design a payments platform
- Design an ad click aggregation system
- Design a distributed cache
Prerequisites: What is system design? and Back-of-the-envelope estimations.
A URL shortener does two things: it turns a long URL into a short code, and it redirects anyone who visits that code back to the original. Simple in concept, tricky at scale. The interesting problems show up when you need to generate billions of unique short codes without collisions, redirect users in under 10ms, and track every click for analytics.
This case study walks through the full design: requirements, capacity math, architecture, and the deep dives that separate a toy project from a production system.
1. Requirements
Functional requirements
- Shorten: Given a long URL, return a short URL like https://short.ly/abc123.
- Redirect: When a user visits a short URL, redirect them (HTTP 301 or 302) to the original.
- Custom slugs: Users can optionally pick their own short code.
- Expiration: Short URLs expire after a configurable TTL (default: 5 years).
- Analytics: Track click count, referrer, timestamp, and geo for each redirect.
Non-functional requirements
- Low latency: Redirects complete in under 10ms at p99.
- High availability: 99.99% uptime. A broken shortener breaks every link ever shared.
- Scale: 10 million daily active users, 100:1 read-to-write ratio.
- Durability: Once created, a short URL must never lose its mapping.
2. Capacity estimation
Start with the users and work outward.
| Metric | Value |
|---|---|
| DAU | 10 million |
| New URLs per day | 1 million (not every user creates a link) |
| Writes per second | ~12 (1M / 86,400) |
| Reads per second | ~1,200 (100:1 ratio) |
| Peak reads per second | ~6,000 (5x burst) |
| URL record size | ~500 bytes (short code + long URL + metadata) |
| Storage per year | ~180 GB (1M/day * 365 * 500 bytes) |
| Storage over 5 years | ~900 GB |
A single database server can handle 1,200 reads/s comfortably. The bottleneck is not raw throughput but redirect latency. With a caching layer absorbing 90%+ of reads, the database sees roughly 120 reads/s, which is trivial.
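The table above can be reproduced as a few lines of arithmetic. A quick sketch under the stated assumptions (1M new URLs/day, 100:1 reads, 5x peak bursts, 500-byte records):

```python
# Reproduce the capacity table as arithmetic; tweak the inputs to re-run the math.
SECONDS_PER_DAY = 86_400
new_urls_per_day = 1_000_000
read_write_ratio = 100
peak_multiplier = 5
record_size_bytes = 500

writes_per_sec = new_urls_per_day / SECONDS_PER_DAY             # ~12
reads_per_sec = writes_per_sec * read_write_ratio               # ~1,200
peak_reads_per_sec = reads_per_sec * peak_multiplier            # ~6,000
storage_per_year_gb = new_urls_per_day * 365 * record_size_bytes / 10**9  # ~180 GB
storage_5_years_gb = storage_per_year_gb * 5                    # ~900 GB
```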
3. High-level architecture
graph LR
Client["Client"] --> LB["Load Balancer"]
LB --> API["API Service"]
API --> Cache["Redis Cache"]
API --> DB["Database"]
API --> IDGen["ID Generator"]
API --> Analytics["Analytics Service"]
Analytics --> Kafka["Message Queue"]
Kafka --> ClickStore["Click Store"]
High-level architecture of a URL shortener. The API service handles both shorten and redirect flows, backed by a cache, database, and async analytics pipeline.
The load balancing layer distributes traffic across stateless API servers. Redirect requests hit the cache first; on a miss they fall through to the database. Analytics events are pushed to a message queue and processed asynchronously so they never slow down the redirect path.
4. API design
Two core endpoints:
POST /api/shorten
{
"long_url": "https://example.com/very/long/path?query=1",
"custom_slug": "my-link", // optional
"ttl_days": 365 // optional
}
Response: { "short_url": "https://short.ly/abc123" }
GET /:short_code
Response: HTTP 302 redirect to the original URL
Why 302 and not 301? A 301 tells the browser to cache the redirect permanently. That is great for performance but it means you lose analytics: the browser will never hit your server again for that link. Use 302 if analytics matter. Use 301 if you want to minimize server load and do not need per-click tracking.
5. Deep dive: short code generation
This is the core problem. You need a function that maps a long URL to a short, unique string. There are three main approaches.
Approach A: Hash and truncate
Hash the long URL with MD5 or SHA-256, base62-encode the digest, and keep the first 7 characters. With base62, 7 characters give you 62^7 = ~3.5 trillion combinations.
The problem: collisions. Two different URLs can produce the same 7-character prefix. You check the database; if the code exists and points to a different URL, append a counter and rehash. This works but adds latency on collision.
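A minimal sketch of Approach A (the database collision check is omitted; `hash_to_code` is an illustrative name, not from any particular library):

```python
import hashlib

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def hash_to_code(long_url: str, length: int = 7) -> str:
    """MD5 the URL, base62-encode the digest, keep the first `length` characters."""
    n = int.from_bytes(hashlib.md5(long_url.encode()).digest(), "big")
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(BASE62[rem])
    return "".join(reversed(chars))[:length]
```

Note that this is deterministic: the same URL always yields the same code, but two different URLs can still share a 7-character prefix, which is why the collision check against the database remains necessary.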
Approach B: Counter-based ID
Use a globally unique, monotonically increasing counter. Convert the integer to base62. No collisions by definition.
The problem: predictability. Users can guess the next short code. Also, a single counter is a bottleneck. You solve the first issue by shuffling bits (a simple XOR cipher). You solve the second with a distributed ID generator, or by pre-allocating ranges to each API server.
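A sketch of Approach B: convert the counter to base62, with a toy XOR step to make consecutive IDs non-obvious (the key below is a made-up constant for illustration, not a real secret):

```python
BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int) -> str:
    """Convert a non-negative integer to a base62 string."""
    if n == 0:
        return BASE62[0]
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(BASE62[rem])
    return "".join(reversed(chars))

OBFUSCATION_KEY = 0x5F3759DF  # illustrative constant

def code_for(counter: int) -> str:
    """XOR the counter before encoding so sequential IDs don't look sequential."""
    return encode_base62(counter ^ OBFUSCATION_KEY)
```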
Approach C: Pre-generated keys
A background service generates random short codes in bulk and stores them in a key pool. When the API needs a new code, it grabs one from the pool. No collision check needed at write time because codes are generated and deduplicated ahead of time.
This is the approach most production systems lean toward. It decouples code generation from the write path and eliminates collision handling entirely.
sequenceDiagram
participant Client
participant API as API Service
participant KP as Key Pool
participant DB as Database
Client->>API: POST /api/shorten
API->>KP: Get next available key
KP-->>API: "abc123"
API->>DB: INSERT (abc123, long_url, metadata)
DB-->>API: OK
API-->>Client: short.ly/abc123
Shorten flow using a pre-generated key pool. The API grabs a key, writes the mapping, and returns immediately.
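A toy in-memory version of the key pool. Production systems back this with a database table of used and unused keys, but the shape is the same: generation and deduplication happen up front, so handing out a key is O(1):

```python
import secrets

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

class KeyPool:
    """Pre-generate deduplicated random codes in bulk."""
    def __init__(self, size: int, length: int = 7):
        self.available: set[str] = set()
        while len(self.available) < size:  # set membership dedupes up front
            code = "".join(secrets.choice(BASE62) for _ in range(length))
            self.available.add(code)

    def take(self) -> str:
        """Pop any unused key; no collision check needed at write time."""
        return self.available.pop()
```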
For custom slugs, skip the key pool. Check the database for conflicts, insert if available, and return an error if the slug is taken.
6. Deep dive: redirect flow
Speed is everything here. Every millisecond of redirect latency is felt by every user who clicks a link.
sequenceDiagram
participant Client
participant LB as Load Balancer
participant API as API Service
participant Cache as Redis
participant DB as Database
Client->>LB: GET /abc123
LB->>API: forward request
API->>Cache: GET abc123
alt Cache hit
Cache-->>API: long_url
else Cache miss
API->>DB: SELECT long_url WHERE short_code = abc123
DB-->>API: long_url
API->>Cache: SET abc123 = long_url
end
API-->>Client: 302 Redirect to long_url
API--)Analytics: async click event
Redirect flow. The cache absorbs the majority of reads. Misses fall through to the database and backfill the cache. Analytics are fire-and-forget.
With a warm cache, the p99 redirect latency sits around 2-5ms. The cache-miss path adds a database round trip, pushing it to 10-20ms, but this only happens for cold or rarely accessed links.
Use a TTL on cache entries that matches the link’s expiration. For popular links, the cache entry stays warm naturally. For long-tail links that get one click a year, letting the cache miss is fine.
7. Data model
erDiagram
URL {
string short_code PK
string long_url
string custom_slug
datetime created_at
datetime expires_at
string user_id FK
}
CLICK {
bigint id PK
string short_code FK
datetime clicked_at
string referrer
string user_agent
string country
string ip_hash
}
USER {
string user_id PK
string email
datetime created_at
}
USER ||--o{ URL : creates
URL ||--o{ CLICK : receives
Entity-relationship diagram. The URL table is the core mapping. Clicks are stored separately for analytics.
The URL table is small and read-heavy, which makes it an excellent candidate for a relational database like PostgreSQL with read replicas. The CLICK table grows fast (potentially billions of rows) and is write-heavy. A time-series store or a columnar database like ClickHouse fits better here.
Sharding the URL table
At 900 GB over five years, a single database node can hold the data. But if you need horizontal scaling, shard by short_code. A hash of the short code determines the shard. This is where consistent hashing helps: it lets you add or remove shards without reshuffling the entire keyspace.
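A hash-modulo sketch of shard selection. In practice a consistent-hashing ring would replace the modulo so that adding a shard only moves a fraction of the keys:

```python
import hashlib

NUM_SHARDS = 4

def shard_for(short_code: str) -> int:
    """Stable hash of the short code maps it to a shard."""
    digest = hashlib.sha256(short_code.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS
```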
8. Handling expiration
Expired links should return HTTP 404, not redirect. Two strategies:
- Lazy deletion: On every read, check expires_at. If expired, return 404 and optionally delete the record. Simple, but expired keys linger in storage.
- Active cleanup: A background job scans for expired records and deletes them in batches. This reclaims storage and keeps the database lean.
Use both. The lazy check catches expired links immediately. The background job cleans up the rest.
9. Analytics pipeline
Every redirect fires an async event containing the short code, timestamp, referrer, user agent, and IP. These events flow into a message queue (Kafka) and are consumed by a processing service that enriches the data (geo lookup from IP, device parsing from user agent) and writes it to the click store.
This pipeline is fully decoupled from the redirect path. If the analytics service goes down, redirects keep working. Events queue up in Kafka and get processed when the service recovers.
For real-time dashboards, maintain a counter in Redis that increments on every redirect. For historical analytics, query the click store directly.
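The real-time counter is just one increment per redirect. Sketched here with an in-memory dict standing in for Redis:

```python
from collections import defaultdict

click_counts: defaultdict[str, int] = defaultdict(int)  # stand-in for Redis

def record_click(short_code: str) -> int:
    """Equivalent of Redis `INCR clicks:<short_code>`; returns the new count."""
    click_counts[short_code] += 1
    return click_counts[short_code]
```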
10. Trade-offs and alternatives
| Decision | Option A | Option B | Recommendation |
|---|---|---|---|
| Short code generation | Hash + collision check | Pre-generated key pool | Key pool. Simpler write path, no collision handling. |
| Redirect status | 301 (permanent) | 302 (temporary) | 302 if analytics matter; 301 for pure speed. |
| Database | SQL (PostgreSQL) | NoSQL (DynamoDB) | SQL for URL mappings (small, relational). NoSQL for clicks (high write volume). |
| Cache eviction | TTL-based | LRU | TTL matching link expiration, with LRU as a fallback for memory pressure. |
| Analytics | Synchronous writes | Async via queue | Always async. Never let analytics slow down redirects. |
What about base62 vs base58?
Base62 uses [a-zA-Z0-9]. Base58 removes ambiguous characters (0, O, I, l). If users will type short codes manually, base58 is friendlier. If codes are always copy-pasted or clicked, base62 gives you a slightly larger keyspace.
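Concretely, base58 is base62 minus the four look-alike characters, which costs a little keyspace at a fixed code length:

```python
BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
AMBIGUOUS = set("0OIl")  # the characters base58 drops
BASE58 = "".join(c for c in BASE62 if c not in AMBIGUOUS)

keyspace_62 = 62 ** 7  # ~3.5 trillion
keyspace_58 = 58 ** 7  # ~2.2 trillion
```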
11. What real systems actually do
Bitly uses a counter-based approach with distributed ID generation. Their system handles tens of billions of links and redirects hundreds of billions of clicks per month. They cache aggressively and use Kafka for analytics.
TinyURL is simpler. It hashes the input URL, checks for collisions, and stores everything in a relational database. At their scale, this works fine.
YouTube uses a similar short-code pattern for video IDs (11-character base64 strings). They pre-generate IDs and assign them at upload time.
The common thread: every production system avoids hash-and-check at write time. They either pre-generate keys or use counters. The write path must be fast and collision-free.
12. What comes next
A URL shortener is a gateway to several deeper system design topics:
- Rate limiting: Prevent abuse by capping URL creation per user per minute. See rate limiting for patterns.
- Spam detection: Run new URLs through a blocklist or ML classifier before accepting them.
- Global distribution: Deploy API servers and caches in multiple regions. Route users to the nearest edge with GeoDNS.
- Link previews: Crawl the destination URL at creation time to extract a title, description, and thumbnail for social sharing.
The core shortener is a weekend project. Making it reliable, fast, and observable at scale is where the real engineering lives.