Message Queue

The simplest async building block. Also one of the most powerful patterns in distributed systems.

The problem with doing it all up front

Your web server gets 80 requests per second. For each one, it has to send a confirmation email. That is a 2-second call to an external email service, and the request blocks until the call returns.

If the email API has a bad day and takes 5 seconds instead of 2, your web server's threads pile up waiting. Latency for users shoots through the roof. If the email service goes down, your whole site goes with it.
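A rough back-of-the-envelope check of that pile-up, using Little's law (requests in flight = arrival rate × time per request). The function name is made up for illustration, and the numbers are the ones from the example above.

```python
def in_flight_requests(arrival_rate_per_s, seconds_per_request):
    """Little's law: average requests in progress = arrival rate * time each one takes."""
    return arrival_rate_per_s * seconds_per_request

normal_day = in_flight_requests(80, 2)  # email call takes 2 seconds
bad_day = in_flight_requests(80, 5)     # email call slows to 5 seconds

print(normal_day)  # 160
print(bad_day)     # 400
```

So the server needs roughly 160 threads or connections tied up just waiting on email on a normal day, and about 400 on a bad one. If the pool is smaller than that, requests queue up behind it.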

The user does not care about the email being sent at this exact moment. They care about getting the success page. Trying to do both at once is the bug.

Put a queue between them

A message queue is a service that takes messages, holds them safely, and lets workers pull them off later.

The web server's job is now small. Drop a message into the queue. The message says something like "send a confirmation email to the customer's address about order #123." Then return success to the user. That takes about 1 ms.

A separate pool of workers reads from the queue, sends the emails, and retries on failure. The user has already moved on.

The queue separates the sender from the worker. They can scale on their own, fail on their own, and deploy on their own.
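A minimal sketch of that split, assuming Python's in-process queue.Queue as a stand-in for a real queue service. The names handle_request, send_email, and MAX_ATTEMPTS are hypothetical, and the address is a placeholder. A real queue runs as a separate service, so messages survive a web server restart; this in-memory version does not.

```python
import queue
import threading

email_queue = queue.Queue()
sent = []          # stands in for the outside world: records delivered emails
MAX_ATTEMPTS = 3   # assumed retry limit

def send_email(message):
    """Hypothetical stand-in for the slow external email call."""
    sent.append(message["order_id"])

def handle_request(order_id, address):
    """Web tier: enqueue the work and return right away."""
    email_queue.put({"order_id": order_id, "to": address, "attempts": 0})
    return "success"

def worker():
    """Worker: pull messages, send them, and re-enqueue on failure."""
    while True:
        message = email_queue.get()
        if message is None:  # shutdown signal
            email_queue.task_done()
            break
        try:
            send_email(message)
        except Exception:
            message["attempts"] += 1
            if message["attempts"] < MAX_ATTEMPTS:
                email_queue.put(message)  # try again later
        finally:
            email_queue.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()
handle_request(123, "user@example.com")
handle_request(124, "user@example.com")
email_queue.join()     # wait until the worker has drained the queue
email_queue.put(None)  # tell the worker to stop
t.join()
print(sent)  # [123, 124]
```

The web tier only ever calls put, and the worker only ever calls get. Neither one knows how many of the other exist, which is what lets them scale and fail separately.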

Three things you get from a queue, for free

1. Spike absorption. Traffic doubles for 5 minutes? The queue grows by a few thousand messages. The workers chew through them at their normal rate. Users feel nothing.

2. Failure isolation. The email service goes down? Messages sit in the queue. Workers retry. When the service comes back, the queue drains. Users never see an error.

3. Independent scaling. Need to send 10 times more emails? Add 10 times more workers. The web tier does not change.
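Spike absorption can be checked with a small simulation. This is a sketch under assumed numbers: 80 messages per second normally, 160 per second during a 5-minute spike, and a worker pool that can process 150 per second in total. The function name is hypothetical.

```python
def simulate_backlog(arrivals_per_s, capacity_per_s):
    """Return the queue depth after each second, given per-second arrival counts."""
    depth = 0
    history = []
    for arrived in arrivals_per_s:
        # Workers remove up to capacity_per_s messages each second.
        depth = max(0, depth + arrived - capacity_per_s)
        history.append(depth)
    return history

# 1 minute of normal traffic, a 5-minute spike, then 2 minutes of normal traffic.
arrivals = [80] * 60 + [160] * 300 + [80] * 120
history = simulate_backlog(arrivals, capacity_per_s=150)

peak = max(history)
seconds_to_drain = history[360:].index(0) + 1  # counted from the end of the spike

print(peak)              # 3000
print(seconds_to_drain)  # 43
```

The backlog only drains because the workers have headroom above the normal arrival rate. If capacity were exactly 80 per second, the spike would leave 24,000 messages in the queue and they would never clear while traffic stayed at 80 per second.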

This is why queues are everywhere. Emails. Push notifications. Video transcoding. Image resizing. Billing runs. Analytics ingestion. ML inference jobs. Anything you can do after the user clicks save.

Promises and gotchas

Different queues offer different promises.

At-least-once delivery is the most common. Every message is delivered at least once, and possibly more than once if a worker crashes in the middle of processing, before it acknowledges the message. So your workers must be idempotent: processing the same message twice must cause no harm.
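One common way to make a worker idempotent is to remember which message IDs it has already handled. The sketch below assumes each message carries a unique "id" field and uses an in-memory set; both are assumptions for illustration. A real worker would keep the IDs in a durable store shared by all workers.

```python
processed_ids = set()  # assumption: in production, a durable shared store
emails_sent = []       # stands in for the real side effect

def handle(message):
    """Process a message once. Return False if it was a duplicate."""
    if message["id"] in processed_ids:
        return False  # already handled, so skip the side effect
    emails_sent.append(message["order_id"])
    processed_ids.add(message["id"])
    return True

msg = {"id": "msg-1", "order_id": 123}
print(handle(msg))   # True
print(handle(msg))   # False, the redelivery is ignored
print(emails_sent)   # [123]
```

There is still a small window here: if the worker crashes after sending but before recording the ID, the email goes out twice. Closing that window needs the side effect and the record to be committed together, or the downstream call to be idempotent itself.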

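Exactly-once delivery means each message is processed exactly one time. It is hard to do. Some systems, like Kafka with transactions, offer it under specific conditions, typically when the work reads from and writes back to the same system. It does not cover outside side effects such as an email that was already sent. Most queues do not offer it at all.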

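Ordering. Are messages processed in the order they were sent? With a single-partition queue, yes. With a partitioned system like Kafka, only inside one partition, not across all of them. Standard SQS queues give best-effort ordering only. SQS FIFO queues preserve order within a message group.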
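To keep related messages in order on a partitioned queue, the usual trick is to route every message with the same key to the same partition. The sketch below uses a CRC32 hash and plain lists as partitions. It illustrates the idea only and is not the partitioner any particular queue uses; partition_for is a hypothetical name.

```python
import zlib

def partition_for(key, num_partitions):
    """Map a key to a partition with a stable hash (crc32, not Python's salted hash())."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

NUM_PARTITIONS = 4
partitions = [[] for _ in range(NUM_PARTITIONS)]

events = [
    ("order-123", "created"),
    ("order-456", "created"),
    ("order-123", "paid"),
    ("order-123", "shipped"),
]
for key, event in events:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, event))

# All events for one order sit in one partition, in the order they were sent.
home = partitions[partition_for("order-123", NUM_PARTITIONS)]
order_123 = [event for key, event in home if key == "order-123"]
print(order_123)  # ['created', 'paid', 'shipped']
```

Events for different orders may land on different partitions and be processed in any relative order. That is usually fine, because ordering only matters within one order.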

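Durability. Do messages survive a crash? With production queues like RabbitMQ, SQS, and Kafka, yes, as long as they are configured for it. RabbitMQ, for example, needs durable queues and persistent messages. In-memory queues like Redis lists are only as durable as Redis itself.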

Pick the queue that matches your needs. SQS is managed and simple. RabbitMQ is usually self-hosted and flexible. Kafka is high-throughput with per-partition ordering and replay.
