How to Crack a System Design Interview

You know the building blocks now. This is how you assemble them under pressure, out loud, in 45 minutes.

The interview is a conversation, not a quiz

Here is the thing nobody tells beginners. A system design interview is not a test of whether you can recite what a load balancer is. You already learned that. It is a test of whether you can take a vague, open-ended prompt ("Design Twitter," "Design a URL shortener," "Design Uber") and drive a structured conversation toward a working system.

The interviewer is your teammate, not your examiner. They want to see how you think. They will interrupt. They will push back. They will say "what if traffic grows 10x?" The whole point is to watch you make decisions and defend them.

That means the worst thing you can do is go silent and start drawing boxes. The second worst thing is to jump straight to the answer you memorized. There is a method, and it is the same every single time. Six steps. Learn them once, apply them to any prompt.

Step 1: Pin down what you are actually building

Never start designing. Start asking. The prompt "Design Twitter" is deliberately under-specified, and the interviewer is waiting to see if you notice.

Nail down two kinds of requirements. Functional requirements are what the system does: "users post tweets," "users see a timeline." Pick the 2 or 3 that matter and explicitly cut the rest: "I'll skip DMs and ads for now, is that OK?" Scoping down is a senior move, not a cop-out.

Non-functional requirements are the qualities that decide the architecture: How many users? Is it read-heavy or write-heavy? How fast must it feel? Can it ever show slightly stale data, or must it be perfectly consistent? These answers are what separate a real design from a generic one. A read-heavy system gets caches and replicas. A write-heavy one gets queues and sharding. You cannot choose until you ask.

Step 2: Do the napkin math

Now put numbers on it. This is the step beginners skip and seniors never skip. You do not need to be exact; you need to find the order of magnitude that breaks the naive design.

A quick back-of-envelope: say 100 million daily users, each making 10 reads a day. That is a billion reads a day. Divide by 86,400 seconds in a day and you get about 12,000 reads per second on average. Traffic is spiky, so peak might be 3x that. If one database does only ~30 reads per second, you already know, before drawing anything, that you will need caching and many servers. The math told you.

Estimate three things: traffic (requests per second), storage (how much data per year), and bandwidth if media is involved. The numbers are not the point; the decisions they force are. "12,000 reads per second means a single database is impossible" is exactly the sentence interviewers want to hear.
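The arithmetic above fits in a few lines. Here is a minimal sketch, assuming the same 100 million daily users and 10 reads each, plus a 3x peak factor. The storage inputs (10 million writes a day at 500 bytes each) and all function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(daily_users: int, requests_per_user: int) -> float:
    """Average requests per second, spread evenly over a day."""
    return daily_users * requests_per_user / SECONDS_PER_DAY


def peak_rps(avg: float, peak_factor: float = 3.0) -> float:
    """Peak load, assuming traffic spikes to peak_factor times the average."""
    return avg * peak_factor


def yearly_storage_gb(writes_per_day: int, bytes_per_write: int) -> float:
    """Raw storage added per year in gigabytes (10^9 bytes), before replication."""
    return writes_per_day * bytes_per_write * 365 / 1e9


avg = average_rps(100_000_000, 10)
print(f"average reads/s: {avg:,.0f}")            # 11,574, i.e. "about 12,000"
print(f"peak reads/s:    {peak_rps(avg):,.0f}")  # 34,722

# Hypothetical write volume, only to show the storage estimate.
print(f"storage/year:    {yearly_storage_gb(10_000_000, 500):,.0f} GB")  # 1,825 GB
```

In the interview you do this in your head and round hard. The code is only there to show that the estimate is three multiplications and a division, nothing more.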

Step 3: Define the API and the data model

Before boxes and arrows, agree on the contract. What are the actual operations? For a URL shortener it is two: POST /urls takes a long URL and returns a short code, and GET /{code} redirects. That is the entire API. Writing it down forces clarity about what the system even does.

Then sketch the data model. What are you storing, and what shape is it? A short code maps to a long URL. That is a simple key-value lookup, which tells you a key-value store or a single indexed table is plenty. A social graph of who-follows-whom is a very different shape and pushes you toward different storage.

This step is short, but it anchors everything after it. The API tells you the read/write paths. The data model tells you which database and which access patterns you must make fast.
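To make the contract concrete, here is a minimal in-memory sketch of the URL shortener: two operations over one key-value map. The class and method names are hypothetical, and generating codes by base-62 encoding a counter is just one assumed scheme, not the only way to do it.

```python
import string
from typing import Dict, Optional

ALPHABET = string.digits + string.ascii_letters  # 62 characters


def encode_base62(n: int) -> str:
    """Turn a non-negative integer ID into a short base-62 code."""
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


class UrlShortener:
    """In-memory stand-in: a dict plays the role of the key-value store."""

    def __init__(self) -> None:
        self._next_id = 1
        self._urls: Dict[str, str] = {}  # data model: short code -> long URL

    def create(self, long_url: str) -> str:
        """POST /urls: store the long URL and return its short code."""
        code = encode_base62(self._next_id)
        self._next_id += 1
        self._urls[code] = long_url
        return code

    def resolve(self, code: str) -> Optional[str]:
        """GET /{code}: return the redirect target, or None if unknown."""
        return self._urls.get(code)


shortener = UrlShortener()
code = shortener.create("https://example.com/some/very/long/path")
print(code, "->", shortener.resolve(code))
```

Notice that the read path (resolve) is a single key lookup. That is the access pattern you must make fast, and it is why a key-value store fits.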

Step 4: Draw the boxes and arrows

Now you draw, and because you did steps 1 to 3, the drawing almost writes itself. Start with the simplest thing that satisfies the requirements: client, server, database. Make it work end to end first.

Then evolve it under the numbers you computed. "12,000 reads per second can't hit one server, so I'll put a load balancer in front and run several stateless app servers." Each box you add should answer a specific pressure you already identified, not appear because you memorized a diagram.

Talk the whole time. "Traffic comes into the load balancer, which spreads it across app servers. Each server is stateless, so any server can handle any request." You are narrating a system into existence. The interviewer is following your logic, and every box is justified by a number from step 2.
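The narration above can be sketched in code. This is a toy model, assuming a round-robin policy (real load balancers offer several) and hypothetical class names. The point it shows is that stateless servers are interchangeable, so the balancer can pick any of them.

```python
import itertools
from typing import List


class AppServer:
    """A stateless server: it keeps nothing between requests."""

    def __init__(self, name: str) -> None:
        self.name = name

    def handle(self, request: str) -> str:
        return f"{self.name} handled {request}"


class RoundRobinBalancer:
    """Spreads requests across servers by taking turns."""

    def __init__(self, servers: List[AppServer]) -> None:
        if not servers:
            raise ValueError("need at least one server")
        self._cycle = itertools.cycle(servers)

    def route(self, request: str) -> str:
        # Any server can take any request because none of them holds state.
        return next(self._cycle).handle(request)


balancer = RoundRobinBalancer([AppServer("app-1"), AppServer("app-2"), AppServer("app-3")])
for path in ["GET /a", "GET /b", "GET /c", "GET /d"]:
    print(balancer.route(path))
```

The fourth request lands back on app-1. Add a server to the list and capacity goes up without touching anything else, which is exactly why you said "stateless" out loud.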

Step 5: Find the bottleneck and go deep

A high-level diagram is never the finish line. The interviewer will point at one box and say "tell me more." Usually it is the part that breaks first, and your step-2 math already told you which one.

For a read-heavy system, the database is almost always the bottleneck. 12,000 reads per second against a 30-per-second database is a fire. So you go deep there: "Most reads are for the same popular items, so I'll put a cache in front of the database. At a 90% hit rate, the database only sees 1,200 reads per second instead of 12,000." Then you discuss the hard parts of that choice: invalidation, what happens on a cache miss, and the thundering-herd problem.
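Here is a minimal sketch of that deep dive, assuming a cache-aside read path where plain dicts stand in for both the cache and the database. The names are hypothetical, and it deliberately ignores invalidation, eviction, and the thundering herd, which are the parts you would then talk through.

```python
from typing import Dict, Optional


def database_load(total_rps: float, hit_rate: float) -> float:
    """Reads per second that still reach the database after the cache."""
    return total_rps * (1 - hit_rate)


class CachedReader:
    """Cache-aside: check the cache first, fall back to the database on a miss."""

    def __init__(self, database: Dict[str, str]) -> None:
        self._db = database
        self._cache: Dict[str, str] = {}
        self.db_reads = 0  # counts how often the database was actually hit

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            return self._cache[key]        # hit: the database is not touched
        self.db_reads += 1                 # miss: pay for one database read
        value = self._db.get(key)
        if value is not None:
            self._cache[key] = value       # populate so the next read is a hit
        return value


print(round(database_load(12_000, 0.90)))  # 1200

reader = CachedReader({"item:1": "popular item"})
for _ in range(10):
    reader.get("item:1")
print(reader.db_reads)  # 1: ten reads of a popular key cost one database read
```

The formula is the sentence from the interview, and the class is the mechanism behind it. Popular keys are what make the hit rate high.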

This is where all 40 concepts you learned pay off. Replication for read scaling. Sharding when one machine is not enough. Queues to absorb write spikes. You are not adding them for show; you are reaching for the right tool because you found the specific bottleneck it solves.

Step 6: There is no perfect design, only tradeoffs

The final move is to show judgment. Every real decision costs you something, and senior engineers say the cost out loud before the interviewer has to ask.

Added a cache? Then reads can be stale for a few seconds: you traded consistency for speed. Added read replicas? Same trade, plus replication lag. Chose NoSQL? You gave up multi-row transactions for easier scaling. There is no design with zero downsides. The signal you are sending is: "I know what I gave up, and here is why it is the right call for these requirements."
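The cache tradeoff is easy to see in code. This is a sketch assuming a time-to-live (TTL) cache with a 5-second expiry. The class name is hypothetical, and the clock is injectable only so the example runs without real waiting.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Serves a cached value until it is ttl_seconds old, then reloads it."""

    def __init__(
        self,
        source: Dict[str, str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._source = source
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry)

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]                  # fast, but possibly stale
        value = self._source[key]            # slow path: go to the source of truth
        self._entries[key] = (value, now + self._ttl)
        return value


fake_now = [0.0]                             # fake clock we can move by hand
db = {"profile:42": "old bio"}
cache = TtlCache(db, ttl_seconds=5.0, clock=lambda: fake_now[0])

cache.get("profile:42")                      # first read loads "old bio"
db["profile:42"] = "new bio"                 # the database changes underneath
print(cache.get("profile:42"))               # old bio (stale, still inside the TTL)
fake_now[0] = 6.0
print(cache.get("profile:42"))               # new bio (entry expired, reloaded)
```

The TTL is the dial. A shorter TTL means fresher reads and more database load; a longer one means the opposite. Saying which way you turned it, and why, is the tradeoff statement.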

Close by naming what you would do next with more time: "To handle 10x growth I would shard the database by user ID. To survive a region outage I would run active-active across two regions." You do not have to build it. Naming it proves you see the road ahead.
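If the interviewer asks what "shard by user ID" means in practice, a sketch like this is enough. It assumes simple hash-modulo placement with a hypothetical shard_for helper. One known cost of this scheme: changing the number of shards moves most keys, which is the problem consistent hashing is designed to reduce.

```python
import hashlib
from collections import Counter


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Use a stable hash. Python's built-in hash() for strings is randomized
    # per process, so it would route the same user differently after a restart.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# The same user always lands on the same shard.
print(shard_for("user-12345", 8) == shard_for("user-12345", 8))  # True

# Many users spread across all shards.
counts = Counter(shard_for(f"user-{i}", 8) for i in range(10_000))
print(sorted(counts.items()))
```

Every read and write for one user goes to one machine, so per-user queries stay cheap. Queries that span users now touch many shards, and that is the tradeoff to name.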

That is the whole method. Six steps, every time: clarify, estimate, API and data model, high-level design, deep-dive the bottleneck, talk tradeoffs. Next, you will run it on a real prompt, and then build the answer yourself.