Event-Driven Architecture

The pattern that lets big companies run hundreds of services without them all melting together.

Two styles of service-to-service communication

Request and response (synchronous). Service A calls Service B over HTTP. A waits for B to reply. If B is down or slow, A is down or slow.

Event-driven. Service A emits an event like "OrderPlaced," "UserSignedUp," or "PaymentFailed" to a queue or pub/sub system. Service B listens for events it cares about and reacts. A never waits for B.

The sync style is simpler but creates tight coupling. A depends on B being up and fast.

The event-driven style decouples A from B at runtime. A only needs the queue to be up, not B. You add or remove subscribers without touching A. The system survives partial outages. B can be down for an hour and A does not notice. Events queue up, as long as the queue stores them durably. B catches up when it is back.
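Here is a minimal sketch of that catch-up behavior. It assumes a plain in-memory deque standing in for a real durable queue, and the function names place_order and drain are made up for illustration.

```python
from collections import deque

# Hypothetical in-memory stand-in for a durable queue or pub/sub topic.
queue = deque()

def place_order(order_id):
    """Service A: record what happened and return at once, without calling B."""
    queue.append({"type": "OrderPlaced", "order_id": order_id})
    return "accepted"

def drain(handled):
    """Service B: work through everything that piled up while it was away."""
    while queue:
        event = queue.popleft()
        handled.append(event["order_id"])

# B is "down". A keeps accepting orders and the events accumulate.
results = [place_order(order_id) for order_id in (1, 2, 3)]
backlog = len(queue)

# B comes back and catches up, oldest event first.
handled = []
drain(handled)
```

A never called B, so B's outage never reached A. A real system would swap the deque for a broker that persists events to disk.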

Modern large systems lean on event-driven for everything that does not need an immediate answer.

Events vs commands

A small but important difference.

A command says "do this." For example, SendEmail([email protected]). It is a request. The sender expects something to happen.

An event says "this happened." For example, UserSignedUp(user_id=42). It is past tense. The sender does not say what to do. The receivers decide on their own.

Commands are like function calls. The caller controls things. Events are like announcements. The publisher does not care who listens.

Event-driven systems emit events. Subscribers decide what to do with them. The signup service emits "UserSignedUp." The email service decides "okay, I will send a welcome email." The recommendations service decides "okay, I will set up a profile."

The signup service did not tell anyone to do those things. It just said what happened.
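The signup example can be sketched with a tiny in-process publisher. The EventBus class and the handler names are assumptions for illustration, and dispatch here is a direct function call, where a real broker would deliver asynchronously.

```python
from collections import defaultdict

class EventBus:
    """Hypothetical in-process pub/sub: maps an event type to its handlers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # The publisher does not know or care who is listening.
        for handler in self._subscribers.get(event_type, []):
            handler(payload)

bus = EventBus()
actions = []

def email_service(event):
    # The email service decides on its own to send a welcome email.
    actions.append(f"welcome email for user {event['user_id']}")

def recommendations_service(event):
    # The recommendations service decides on its own to set up a profile.
    actions.append(f"profile for user {event['user_id']}")

bus.subscribe("UserSignedUp", email_service)
bus.subscribe("UserSignedUp", recommendations_service)

def sign_up(user_id):
    """Signup service: states what happened, issues no instructions."""
    bus.publish("UserSignedUp", {"user_id": user_id})

sign_up(42)
```

Adding a third subscriber is one more subscribe call. The sign_up function does not change.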

Event sourcing. The event log IS the data

You can take event-driven thinking further. Instead of storing the current state of your data, store the sequence of events that produced it.

A bank account is not a row with "balance: $5000." It is a list of events. Deposit $1000. Deposit $4500. Withdraw $500. The current balance is just the result of replaying all of them.
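The bank account above can be replayed in a few lines. This is a sketch, and the Event class and the apply and replay functions are illustrative names, not a standard API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str    # "Deposit" or "Withdraw"
    amount: int  # whole dollars

def apply(balance, event):
    """Return the balance after one event. Events are never modified."""
    if event.kind == "Deposit":
        return balance + event.amount
    if event.kind == "Withdraw":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")

def replay(events):
    """Current state is whatever you get by applying every event in order."""
    balance = 0
    for event in events:
        balance = apply(balance, event)
    return balance

log = [Event("Deposit", 1000), Event("Deposit", 4500), Event("Withdraw", 500)]
balance = replay(log)  # 1000 + 4500 - 500 = 5000
```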

Event sourcing gives you a few wins.

A perfect audit trail. Every change is on record.

Time travel. You can rebuild the state at any past moment by replaying events up to that point.

Many read models. Different services can build different views from the same stream of events.
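Time travel and multiple read models both fall out of the same log. The sketch below assumes each event carries a sequence number, and the function names are made up for illustration.

```python
# Each event is (sequence number, kind, amount). The sequence number is an
# assumption here; real event stores usually attach one or a timestamp.
log = [
    (1, "Deposit", 1000),
    (2, "Deposit", 4500),
    (3, "Withdraw", 500),
]

def balance_as_of(events, last_seq):
    """Time travel: replay only the events up to and including last_seq."""
    total = 0
    for seq, kind, amount in events:
        if seq > last_seq:
            break
        total += amount if kind == "Deposit" else -amount
    return total

def withdrawal_count(events):
    """A second read model built from the same stream."""
    return sum(1 for _seq, kind, _amount in events if kind == "Withdraw")

before_withdrawal = balance_as_of(log, 2)  # state after the two deposits
current = balance_as_of(log, 3)            # state after every event
withdrawals = withdrawal_count(log)
```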

It is powerful but it adds complexity. Replaying events for every read is slow. So you build "projections," which are precomputed views of state that are kept up to date as new events arrive. Reads hit the projection, not the full log.
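A projection can be sketched as a small object that updates a stored balance as each event arrives, so a read becomes a dictionary lookup. The BalanceProjection class, the account id, and the sequence-number check are assumptions for illustration.

```python
class BalanceProjection:
    """Hypothetical read model: keeps balances current as events arrive."""

    def __init__(self):
        self.balances = {}
        self.last_seq = 0  # highest sequence number already applied

    def handle(self, seq, account, kind, amount):
        if seq <= self.last_seq:
            return  # already applied, so a redelivered event is ignored
        delta = amount if kind == "Deposit" else -amount
        self.balances[account] = self.balances.get(account, 0) + delta
        self.last_seq = seq

projection = BalanceProjection()
events = [
    (1, "acct-1", "Deposit", 1000),
    (2, "acct-1", "Deposit", 4500),
    (3, "acct-1", "Withdraw", 500),
]
for event in events:
    projection.handle(*event)

# The same event delivered twice does not change the result.
projection.handle(3, "acct-1", "Withdraw", 500)

balance = projection.balances["acct-1"]
```

If the projection is lost or its logic changes, you can throw it away and rebuild it by replaying the log from the start.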

You see it in banking, billing, and anywhere audit and correctness matter more than raw speed.

When event-driven is the wrong choice

Event-driven is not always better. It is wrong when any of these apply.

You need a sync answer. "Did the payment go through" needs an immediate yes or no, not "we will process this and let you know later."

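Order matters across many services. If service A emits event 1 and service B emits event 2 and the order matters for service C, nothing guarantees C sees them in that order. Brokers typically preserve order only within a single queue or partition, so events from different producers can arrive scrambled.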

You are a small team without good tools. Debugging "where did this message go" across 20 services without distributed tracing is rough.

The work is tiny. If the downstream call is under 10 ms anyway, adding queue infrastructure is overkill. Just call it directly.

A rule of thumb. Event-driven for cross-service "this happened" notifications. Sync for critical paths where you need an answer now. Most real systems are a mix.
