Microservice communication patterns
In this series (12 parts)
- Monolith vs microservices
- Microservice communication patterns
- Service discovery and registration
- Event-driven architecture
- Distributed data patterns
- Caching architecture patterns
- Search architecture
- Storage systems at scale
- Notification systems
- Real-time systems architecture
- Batch and stream processing
- Multi-region and global systems
Once you split a monolith into services, every function call that crossed a module boundary becomes a network call. That single change introduces latency, partial failure, and the need for explicit contracts between teams. The communication pattern you choose shapes your system’s reliability, coupling, and operational complexity more than almost any other architectural decision.
Synchronous: request-response
The simplest pattern is a direct HTTP or gRPC call. Service A sends a request to Service B and blocks until it gets a response. This feels natural because it mirrors how functions work inside a monolith, but the similarity is deceptive.
sequenceDiagram
    participant Client
    participant OrderSvc as Order Service
    participant InvSvc as Inventory Service
    participant PaySvc as Payment Service
    Client->>OrderSvc: POST /orders
    OrderSvc->>InvSvc: Check stock
    InvSvc-->>OrderSvc: Stock available
    OrderSvc->>PaySvc: Charge card
    PaySvc-->>OrderSvc: Payment confirmed
    OrderSvc-->>Client: Order created
Synchronous chain: the client waits while the order service calls inventory and payment sequentially.
The total latency is the sum of all downstream calls plus network overhead. If the inventory check takes 50ms and the payment call takes 200ms, the client sees at least 250ms plus serialization and network time. Worse, if any service in the chain is down, the entire request fails. This temporal coupling means your system’s availability is the product of each service’s availability. Three services at 99.5% uptime give you 98.5% combined.
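The arithmetic above is worth making concrete; a quick sketch:

```python
# Availability of a synchronous chain is the product of each service's
# availability; latency is at least the sum of each sequential hop.

def chain_availability(availabilities):
    """Probability that every service in the chain is up at once."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

def chain_latency_ms(latencies_ms):
    """Lower bound on client-observed latency for a sequential chain."""
    return sum(latencies_ms)

print(f"{chain_availability([0.995, 0.995, 0.995]):.3%}")  # 98.507%
print(chain_latency_ms([50, 200]))                         # 250
```

Every service added to a synchronous chain multiplies in another availability factor, which is why long chains degrade quickly.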
Synchronous communication works well for queries where the client genuinely needs the response before proceeding. It becomes problematic for commands where you are triggering side effects across multiple services.
Asynchronous: event-driven
The alternative is to decouple services through message queues. Instead of calling a service directly, you publish an event to a broker. Interested services consume events at their own pace.
graph LR
    OS["Order Service"] -->|"OrderPlaced"| MQ["Message Broker"]
    MQ -->|"consume"| IS["Inventory Service"]
    MQ -->|"consume"| PS["Payment Service"]
    MQ -->|"consume"| NS["Notification Service"]
    IS -->|"StockReserved"| MQ
    PS -->|"PaymentProcessed"| MQ
Event-driven: the order service publishes an event and moves on. Consumers process independently.
This eliminates temporal coupling. The order service does not need the inventory or payment service to be running at the moment the order is placed. Events queue up in the broker and get processed when the consumer is ready. Adding a new consumer (say, an analytics service) requires zero changes to the producer.
The tradeoff is that you lose the immediate feedback loop. The client gets an acknowledgment that the order was accepted, but the actual processing happens asynchronously. You need to design for eventual consistency and provide mechanisms for the client to check status later, through polling, webhooks, or server-sent events.
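The publish-and-move-on mechanics can be sketched with an in-memory broker; the queues here stand in for a real broker such as RabbitMQ or Kafka, and all names are illustrative:

```python
import queue
from collections import defaultdict

class Broker:
    """Toy message broker: one queue per subscriber per topic."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic):
        q = queue.Queue()
        self.subscribers[topic].append(q)
        return q

    def publish(self, topic, event):
        # The producer returns immediately; consumers drain later.
        for q in self.subscribers[topic]:
            q.put(event)

broker = Broker()
inventory_q = broker.subscribe("OrderPlaced")
notification_q = broker.subscribe("OrderPlaced")

# The order service publishes and moves on, even if consumers are busy.
broker.publish("OrderPlaced", {"order_id": 42, "sku": "ABC-1"})

# Later, each consumer processes at its own pace.
print(inventory_q.get_nowait())     # {'order_id': 42, 'sku': 'ABC-1'}
print(notification_q.get_nowait())  # {'order_id': 42, 'sku': 'ABC-1'}
```

Note that adding the notification consumer required no change to the publish call, which is the decoupling property described above.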
The saga pattern for distributed transactions
In a monolith, you wrap multiple operations in a database transaction: debit the account, create the order, reserve inventory, all or nothing. In a distributed system, you cannot use a single transaction across services. Each service owns its own database, and two-phase commit (2PC) is fragile and slow at scale.
The saga pattern replaces a single transaction with a sequence of local transactions, each with a compensating action that undoes its effect if a later step fails. There are two coordination approaches: choreography and orchestration.
Choreography
In choreography, each service listens for events and decides what to do next. There is no central coordinator. The flow emerges from the event chain.
sequenceDiagram
    participant OS as Order Service
    participant IS as Inventory Service
    participant PS as Payment Service
    participant NS as Notification Service
    OS->>OS: Create order (PENDING)
    OS-)IS: OrderCreated event
    IS->>IS: Reserve stock
    IS-)PS: StockReserved event
    PS->>PS: Process payment
    PS-)NS: PaymentProcessed event
    PS-)OS: PaymentProcessed event
    NS->>NS: Send confirmation
    OS->>OS: Update order (CONFIRMED)
    Note over PS,OS: If payment fails...
    PS-)IS: PaymentFailed event
    IS->>IS: Release stock (compensate)
    IS-)OS: StockReleased event
    OS->>OS: Update order (CANCELLED)
Choreographed saga: services react to events and publish their own. Compensation flows backward on failure.
Choreography works well for simple flows with few steps. It avoids a single point of failure since there is no orchestrator to go down. The downside is that the business logic is spread across multiple services, making it hard to understand the full flow by reading any single codebase.
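The "flow emerges from the event chain" idea can be sketched with a tiny in-process event bus; the handler and event names are illustrative, not a real framework:

```python
from collections import defaultdict

handlers = defaultdict(list)
log = []

def on(event_type):
    """Register a handler for an event type."""
    def register(fn):
        handlers[event_type].append(fn)
        return fn
    return register

def publish(event_type, payload):
    log.append(event_type)
    for fn in handlers[event_type]:
        fn(payload)

@on("OrderCreated")
def reserve_stock(order):
    publish("StockReserved", order)

@on("StockReserved")
def process_payment(order):
    if order["card_ok"]:
        publish("PaymentProcessed", order)
    else:
        publish("PaymentFailed", order)

@on("PaymentFailed")
def release_stock(order):  # compensating action
    publish("StockReleased", order)

publish("OrderCreated", {"id": 1, "card_ok": False})
print(log)
# ['OrderCreated', 'StockReserved', 'PaymentFailed', 'StockReleased']
```

Notice that no single function describes the whole flow; to reconstruct it you must trace which handler publishes which event, which is exactly the readability cost mentioned above.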
Orchestration
In orchestration, a dedicated saga orchestrator tells each service what to do and handles the compensation logic centrally.
sequenceDiagram
    participant Orch as Order Saga Orchestrator
    participant IS as Inventory Service
    participant PS as Payment Service
    participant NS as Notification Service
    Orch->>IS: Reserve stock
    IS-->>Orch: Stock reserved
    Orch->>PS: Process payment
    PS-->>Orch: Payment processed
    Orch->>NS: Send confirmation
    NS-->>Orch: Confirmation sent
    Note over Orch,PS: If payment fails...
    Orch->>IS: Release stock (compensate)
    IS-->>Orch: Stock released
    Orch->>Orch: Mark order cancelled
Orchestrated saga: a central coordinator drives the flow and handles compensation.
Orchestration makes the business flow explicit and testable in one place. The orchestrator is a state machine that knows every step and every compensating action. The cost is a central component that becomes a coordination bottleneck and a single point of failure if not designed for reliability.
Most production systems use orchestration for complex flows (five or more steps) and choreography for simple, two-to-three-step interactions.
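The orchestrator's core loop is a state machine over (step, compensation) pairs; a minimal sketch, with the step names standing in for real service calls:

```python
# Orchestrated saga: run each step in order; on failure, run the
# compensations of the steps that already completed, in reverse order.

def run_saga(steps):
    """steps: list of (action, compensation) pairs.
    Actions return True on success, False on failure."""
    done = []
    for action, compensate in steps:
        if action():
            done.append(compensate)
        else:
            for comp in reversed(done):  # undo completed steps
                comp()
            return "CANCELLED"
    return "CONFIRMED"

trace = []
steps = [
    (lambda: trace.append("reserve stock") or True,
     lambda: trace.append("release stock")),
    (lambda: trace.append("charge card") or False,  # payment fails
     lambda: trace.append("refund card")),
]
print(run_saga(steps))  # CANCELLED
print(trace)            # ['reserve stock', 'charge card', 'release stock']
```

Because every step and every compensating action lives in one place, the whole business flow can be unit-tested without standing up any of the services.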
Service mesh
As the number of services grows, cross-cutting concerns like retries, timeouts, mutual TLS, and traffic shaping become painful to implement in every service. A service mesh moves this logic out of the application and into a sidecar proxy that runs alongside each service instance.
graph TD
subgraph Pod_A["Service A Pod"]
SA["Service A"] <-->|"localhost"| PA["Envoy Proxy"]
end
subgraph Pod_B["Service B Pod"]
SB["Service B"] <-->|"localhost"| PB["Envoy Proxy"]
end
subgraph Pod_C["Service C Pod"]
SC["Service C"] <-->|"localhost"| PC["Envoy Proxy"]
end
PA <-->|"mTLS"| PB
PA <-->|"mTLS"| PC
PB <-->|"mTLS"| PC
CP["Control Plane (Istiod)"] -->|"config"| PA
CP -->|"config"| PB
CP -->|"config"| PC
Service mesh architecture: sidecar proxies handle networking concerns. The control plane pushes configuration.
Istio and Linkerd are the two most common service mesh implementations. The sidecar proxy (typically Envoy) intercepts all inbound and outbound traffic. The control plane manages routing rules, security policies, and observability configuration. Your application code makes plain HTTP calls; the proxy handles mTLS, retries, circuit breaking, and distributed tracing headers transparently.
A service mesh is powerful but adds operational complexity. You are now running twice as many containers (every service gets a sidecar), debugging network issues requires understanding the proxy layer, and the control plane itself needs to be highly available. Adopt a mesh when you have more than 15-20 services and when the cross-cutting concern pain is real, not anticipated.
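To make the cross-cutting pain concrete, here is the kind of retry-with-backoff wrapper every service ends up hand-rolling around outbound calls when there is no mesh; a sidecar proxy applies an equivalent policy transparently. The function names are illustrative:

```python
import time

def call_with_retries(call, attempts=3, backoff_s=0.0):
    """Retry a flaky network call with exponential backoff;
    re-raise the error after the final attempt."""
    for attempt in range(attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            time.sleep(backoff_s * (2 ** attempt))

# Simulate a downstream call that fails twice, then succeeds.
responses = iter([ConnectionError, ConnectionError, "200 OK"])

def flaky_inventory_check():
    result = next(responses)
    if result is ConnectionError:
        raise ConnectionError("connection reset")
    return result

print(call_with_retries(flaky_inventory_check))  # 200 OK
```

Multiply this by timeouts, mTLS, and tracing headers, across every service and every language in the organization, and the appeal of pushing it into a sidecar becomes clear.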
API gateway vs Backend for Frontend (BFF)
External clients should not call individual microservices directly. An API gateway sits at the edge and provides a single entry point, handling authentication, rate limiting, request routing, and response aggregation.
graph TD
    Web["Web App"] --> GW["API Gateway"]
    Mobile["Mobile App"] --> GW
    GW --> US["User Service"]
    GW --> OS["Order Service"]
    GW --> PS["Product Service"]
A single API gateway routes external traffic to internal services.
The problem with a single gateway is that different clients have different needs. A mobile app wants a compact payload optimized for bandwidth. A web dashboard wants rich, nested data. A third-party API consumer wants a stable, versioned contract. Cramming all of this into one gateway creates a bottleneck, both technically and organizationally.
The Backend for Frontend (BFF) pattern solves this by creating a dedicated backend for each client type.
graph TD
    Web["Web App"] --> WebBFF["Web BFF"]
    Mobile["Mobile App"] --> MobileBFF["Mobile BFF"]
    Partner["Partner API"] --> PartnerBFF["Partner BFF"]
    WebBFF --> US["User Service"]
    WebBFF --> OS["Order Service"]
    MobileBFF --> US
    MobileBFF --> PS["Product Service"]
    PartnerBFF --> OS
    PartnerBFF --> PS
BFF pattern: each client type gets its own backend that aggregates and tailors responses.
Each BFF is owned by the frontend team that consumes it. The web team controls what the web BFF returns, the mobile team controls the mobile BFF. This eliminates the coordination bottleneck of a shared gateway team and lets each client evolve independently.
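The "same services, different shapes" idea can be sketched in a few lines; the service stubs and field names here are illustrative:

```python
# Two BFFs call the same internal services but shape different responses.

def user_service(user_id):
    return {"id": user_id, "name": "Ada", "email": "ada@example.com",
            "preferences": {"theme": "dark"}}

def order_service(user_id):
    return [{"id": 7, "items": [{"sku": "ABC", "qty": 2}],
             "total_cents": 4200}]

def mobile_bff(user_id):
    """Compact payload: just what the mobile home screen renders."""
    user = user_service(user_id)
    orders = order_service(user_id)
    return {"name": user["name"], "open_orders": len(orders)}

def web_bff(user_id):
    """Rich payload: nested data for the web dashboard."""
    return {"user": user_service(user_id), "orders": order_service(user_id)}

print(mobile_bff(1))  # {'name': 'Ada', 'open_orders': 1}
```

The mobile team can trim or reshape its payload without negotiating with the web team, because each BFF is a separate deployable owned by its consumer.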
Choosing the right pattern
There is no universal answer; the right pattern depends on the interaction.
Use synchronous request-response for real-time queries where the client needs an immediate answer, for simple CRUD operations, and for service-to-service calls within the same bounded context.
Use asynchronous events for notifications and side effects, for operations that span multiple services, for workloads where throughput matters more than latency, and for any interaction where you want to decouple the producer from the consumer.
Use sagas whenever a business process spans multiple services and requires all-or-nothing semantics. Default to orchestration for complex flows and choreography for simple ones.
Use a service mesh when cross-cutting networking concerns become a tax on every team. Use a BFF when different client types need fundamentally different API shapes.
What comes next
Services need to find each other before they can communicate. Static configuration breaks down as instances scale up and down dynamically. Continue with service discovery and registration to understand how services locate each other at runtime.