Design a URL Shortener

"Design a service like bit.ly." A classic opener. Let us run the 6-step method on it, then you build the read path yourself.

Step 1: Scope the prompt

"Design a URL shortener like bit.ly." First, do not design, clarify.

Functional requirements: a user gives a long URL and gets back a short code (bit.ly/abc123). When anyone visits that short link, they get redirected to the original URL. We will skip custom aliases, expiry, and analytics for now, call that out loud so the interviewer knows it is a choice, not an oversight.

Non-functional requirements: here is the one that shapes everything. A link is created once but clicked thousands of times. This system is overwhelmingly read-heavy, easily 100 reads for every write. It must redirect fast (low latency), and it is fine if a brand-new link takes a second to become visible everywhere (we can tolerate slight staleness). Read-heavy + latency-sensitive + staleness-tolerant is the exact profile that screams "cache."

Step 2: Napkin math

Put numbers on it. Say we create 100 million new URLs per month. That is about 40 writes per second on average, small.

But reads dominate. At 100:1, that is 4,000 reads per second on average, and peak traffic is spikier still. Compare that to a single database, which handles around 30 reads per second comfortably. The gap is enormous. The math just told you the entire shape of the answer: one database cannot serve the reads, so you need many app servers behind a load balancer, and a cache to keep most reads off the database.

Storage: 100M URLs a month × maybe 500 bytes each is ~50 GB a month, ~600 GB a year. That fits comfortably on a single database for years, so we do not need to shard for storage. The pressure is read throughput, not data size. Knowing which problem you have keeps you from over-engineering.

Step 3: API and data model

The API is tiny, two endpoints:

POST /urls with a long URL in the body, returns the short code. GET /{code} looks up the code and responds with a redirect to the long URL. That is the whole contract.

The data model is just as simple: a mapping from short code → long URL. That is a key-value lookup. We can store it in a single indexed table or a key-value store; either is fine because the access pattern is "given a code, find one row." No joins, no complex queries.

How do you generate the short code? Two common answers: hash the URL and take the first few characters (watch for collisions), or take an auto-incrementing number and encode it in base62 (a–z, A–Z, 0–9). Base62 with 7 characters gives 62^7 ≈ 3.5 trillion codes, plenty. Mention the choice and the tradeoff; you do not need to implement it.

Step 4: The high-level read path

Now draw it, starting simple and evolving under the numbers. The naive version is user → server → database. It works, but step 2 told us 4,000+ reads per second will flatten a single server (40 r/s) and a single database (30 r/s).

So evolve it. Put a load balancer in front and run several stateless app servers behind it. Because the servers hold no per-user state, every redirect is just a code lookup, any server can handle any request, and the load balancer can spread traffic freely. That solves the app tier.

But every one of those reads still lands on the database. Even spread across servers, 4,000 reads per second against a 30-per-second database is a guaranteed meltdown. The app tier scales; the data tier does not. That is the bottleneck, and it is exactly what step 5 is for.

Step 5: Cache the hot links

Here is the deep-dive, and it falls straight out of step 1: this system is read-heavy, and most reads are for the same popular links. A handful of viral URLs get the overwhelming majority of clicks. That is the perfect case for a cache.

Put a cache (Redis or Memcached) between the app servers and the database. On a redirect, a server checks the cache first. Hit? Return the URL instantly, database never touched. Miss? Read from the database, store it in the cache, and the next thousand requests for that link are hits.

Do the math: at a 90% hit rate, the database sees one tenth of the reads. 4,000 reads per second becomes 400, and with hot links it is even better. A box that was on fire is now bored. This is the whole game of read-heavy design: keep the slow, durable store cold by catching repeat reads in a fast layer in front of it. The tradeoff, as always: a freshly changed link can be stale until the cache entry expires, which we already decided we can tolerate.

Step 6: Tradeoffs, then build it

Close like a senior. Name what you traded and what you would do next.

Tradeoffs: the cache buys huge read throughput at the cost of slight staleness, fine here. The write path is small (40/s), so a single primary database is enough; no sharding needed for years, and you said why (storage is tiny, the pressure is reads). Code generation via base62 over an auto-increment ID avoids collisions but needs a single source of IDs, a tradeoff worth naming.

What is next with more time: read replicas if the 10% of misses still strains the primary; a CDN or geo-distributed caches so redirects are fast worldwide; analytics on clicks via an async queue so counting never slows the redirect.

That is a complete, defensible answer to a real interview question, and you drove it with the same six steps every time. Now stop reading and build the read path yourself. The user is sending 150 r/s. Servers handle 40 r/s, the database only 30. Assemble a system that serves all of it without melting the database.

Now build it yourself →