Stateless Servers

The one rule that lets you grow from one server to a thousand without breaking anything.

What "state" means

State is anything a server remembers between requests. It is information held in the server's own memory, so it would be lost if you restarted the server, and other servers do not know about it.

Here is an example. You add an item to your shopping cart on a website. The server saves the cart in its local memory. The next time you add an item, you hit the same server and it updates that cart. This is a stateful server. It is keeping track of you.

The opposite is a stateless server. A stateless server keeps nothing in its own memory between requests. Every request must carry everything the server needs to handle it, either the data itself or an identifier the server can use to look the data up somewhere shared. Two requests from the same person can land on totally different servers and both work just fine, because neither server is depending on its own memory of past requests.

Why memory in the server breaks scaling

Imagine you have a load balancer with three servers behind it. A user adds something to their cart. The load balancer sends that request to Server 2. Server 2 saves the cart in its memory.

A second later the user adds another item. The load balancer rotates and sends this request to Server 1. But Server 1 has never seen this user before. "What cart?" The new item is lost or saved to a different cart. The user is confused.
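The failure above can be reproduced in a few lines. This is a minimal sketch, not a real load balancer: the StatefulServer class and the round-robin rotation are made up for illustration, and the order in which servers are hit differs from the story above. The effect is the same.

```python
from itertools import cycle


class StatefulServer:
    """A server that keeps carts in its own process memory."""

    def __init__(self, name):
        self.name = name
        self.carts = {}  # user_id -> list of items, private to this server

    def add_to_cart(self, user_id, item):
        self.carts.setdefault(user_id, []).append(item)
        return list(self.carts[user_id])


servers = [StatefulServer(f"server-{n}") for n in (1, 2, 3)]
round_robin = cycle(servers)  # simplest possible load-balancing policy

# Two requests from the same user land on two different servers.
first = next(round_robin).add_to_cart("user-42", "book")
second = next(round_robin).add_to_cart("user-42", "lamp")

print(first)   # ['book']
print(second)  # ['lamp']  <- the book is missing, it lives on another server
```

The user now has two half-carts, and no single server knows about both items.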

You can patch this with sticky sessions, which pin each user to a specific server. But now if Server 2 crashes, the cart of every user pinned to it disappears. Deploying new code becomes complicated because you have to drain users away gracefully. And one heavy user can overload one server while the others sit idle.
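Here is a rough sketch of why sticky sessions only hide the problem. It assumes the load balancer pins users by hashing a user ID, which is one possible approach (real load balancers often use a cookie or the client IP instead). The names pick_server and StatefulServer are hypothetical.

```python
import zlib


class StatefulServer:
    """A server that keeps carts in its own process memory."""

    def __init__(self, name):
        self.name = name
        self.carts = {}

    def add_to_cart(self, user_id, item):
        self.carts.setdefault(user_id, []).append(item)
        return list(self.carts[user_id])


def pick_server(user_id, servers):
    """Sticky routing: hash the user ID so the same user always gets the same slot."""
    return servers[zlib.crc32(user_id.encode()) % len(servers)]


servers = [StatefulServer(f"server-{n}") for n in (1, 2, 3)]

# Both requests reach the same server, so the cart looks fine.
pick_server("user-42", servers).add_to_cart("user-42", "book")
cart_before = pick_server("user-42", servers).add_to_cart("user-42", "lamp")

# Simulate a crash: the pinned server is replaced by a fresh, empty one.
slot = servers.index(pick_server("user-42", servers))
servers[slot] = StatefulServer("replacement")

cart_after = pick_server("user-42", servers).carts.get("user-42", [])

print(cart_before)  # ['book', 'lamp']
print(cart_after)   # []
```

Routing stayed consistent, but the cart was still tied to one machine's memory and went away with it.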

The real problem is this. Stateful servers cannot be treated as interchangeable. And being interchangeable is the entire point of having many servers.

The fix is to move state somewhere shared

Take the state out of the server's memory. Put it in a separate place that every server can read from and write to.

For shopping carts, that place is a database or a fast in-memory store like Redis. The cart belongs to "user 42," not to "Server 2."
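As a sketch of the same cart with the state moved out, here a plain Python class stands in for the database or Redis. In a real system the store would be a separate service reached over the network. SharedCartStore and StatelessServer are illustrative names, not a real API.

```python
from itertools import cycle


class SharedCartStore:
    """Stand-in for a database or Redis. Every server talks to the same one."""

    def __init__(self):
        self._carts = {}  # user_id -> list of items

    def append(self, user_id, item):
        self._carts.setdefault(user_id, []).append(item)

    def get(self, user_id):
        return list(self._carts.get(user_id, []))


class StatelessServer:
    """Holds a handle to shared storage, but no cart data of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, user_id, item):
        self.store.append(user_id, item)
        return self.store.get(user_id)


store = SharedCartStore()
servers = [StatelessServer(f"server-{n}", store) for n in (1, 2, 3)]
round_robin = cycle(servers)

first = next(round_robin).add_to_cart("user-42", "book")
second = next(round_robin).add_to_cart("user-42", "lamp")

# Throw every server away and start a brand new one. The cart survives.
replacement = StatelessServer("server-4", store)
third = replacement.add_to_cart("user-42", "mug")

print(second)  # ['book', 'lamp']
print(third)   # ['book', 'lamp', 'mug']
```

It does not matter which server handles a request, because the cart is keyed by the user and lives in the store.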

For login sessions, you use a shared session store. That store maps a session ID to who the user is. Any server can look up any session ID and find out who is asking.
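A shared session store can be sketched the same way. The SessionStore class and handle_request function below are hypothetical, and the dictionary stands in for whatever shared store you actually run.

```python
import secrets


class SessionStore:
    """Stand-in for a shared session store that maps session ID -> user."""

    def __init__(self):
        self._sessions = {}

    def create(self, user_id):
        session_id = secrets.token_urlsafe(32)  # random, hard-to-guess ID
        self._sessions[session_id] = user_id
        return session_id

    def lookup(self, session_id):
        return self._sessions.get(session_id)


def handle_request(server_name, session_id, store):
    """Any server can run this. It learns who is asking from the store."""
    user_id = store.lookup(session_id)
    if user_id is None:
        return f"{server_name}: 401 please log in"
    return f"{server_name}: hello {user_id}"


store = SessionStore()
session_id = store.create("user-42")  # happens once, at login

reply_a = handle_request("server-1", session_id, store)
reply_b = handle_request("server-3", session_id, store)
reply_c = handle_request("server-2", "not-a-real-session", store)

print(reply_a)  # server-1: hello user-42
print(reply_b)  # server-3: hello user-42
print(reply_c)  # server-2: 401 please log in
```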

Another option is to give the user a small signed token when they log in. The most common format is a JWT (JSON Web Token). The token itself contains the user info. The server checks the signature, and if it is valid, trusts what the token says. No lookup needed.
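To show the idea behind a signed token, here is a minimal sketch using an HMAC from the Python standard library. It is not the real JWT format (a real JWT has a header, standard claims such as an expiry time, and should be handled by a maintained library). The secret key and function names are made up for illustration.

```python
import base64
import hashlib
import hmac
import json

# Assumption: every server is configured with the same secret key.
SECRET_KEY = b"demo-secret-shared-by-all-servers"


def issue_token(claims):
    """Encode the claims and attach a signature only key holders can produce."""
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest().encode()
    return (payload + b"." + signature).decode()


def verify_token(token):
    """Return the claims if the signature matches, otherwise None."""
    payload, _, signature = token.encode().rpartition(b".")
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


token = issue_token({"user_id": 42})

claims = verify_token(token)           # any server with the key can do this
tampered = verify_token("x" + token)   # changing the payload breaks the signature

print(claims)    # {'user_id': 42}
print(tampered)  # None
```

The server never stores anything about the user. It only needs the key, which is configuration, not per-user state.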

For uploaded files, you use object storage like AWS S3 instead of saving files on the server's disk. Every server can access every file.

The pattern is always the same. The thing that holds the data is outside the server. The server is now just code that runs. It can be killed and replaced any time without losing anything important.

What you get when your servers are stateless

Once you commit to stateless servers, a lot of useful things become possible.

You can scale horizontally. Traffic doubled overnight? Start more servers. They are identical to the old ones. The load balancer adds them to the pool. Done.

You can deploy without downtime. To release new code, start new servers with the new code, then retire the old ones a few at a time. Users do not notice the swap because there was nothing important to keep on the old ones.

You recover from crashes automatically. A server died? Replace it. Nothing was lost because nothing important was stored there.

You can run in multiple regions. Spin up the same servers in Europe, Asia, the US. Route users to whichever one is closest.

This whole pattern is what powers Kubernetes, AWS Auto Scaling Groups, AWS Lambda, and Cloudflare Workers. By default, they all assume the servers running your code are stateless and can be stopped or replaced at any moment. If your code secretly stores stuff in memory between requests, none of this works reliably.

The right time to design stateless is day one. Refactoring a stateful app to be stateless later is painful.
