High Level Design · Part 2

Microservice communication patterns

In this series (12 parts)
  1. Monolith vs microservices
  2. Microservice communication patterns
  3. Service discovery and registration
  4. Event-driven architecture
  5. Distributed data patterns
  6. Caching architecture patterns
  7. Search architecture
  8. Storage systems at scale
  9. Notification systems
  10. Real-time systems architecture
  11. Batch and stream processing
  12. Multi-region and global systems

Once you split a monolith into services, every function call that crossed a module boundary becomes a network call. That single change introduces latency, partial failure, and the need for explicit contracts between teams. The communication pattern you choose shapes your system’s reliability, coupling, and operational complexity more than almost any other architectural decision.

Synchronous: request-response

The simplest pattern is a direct HTTP or gRPC call. Service A sends a request to Service B and blocks until it gets a response. This feels natural because it mirrors how functions work inside a monolith, but the similarity is deceptive.

sequenceDiagram
  participant Client
  participant OrderSvc as Order Service
  participant InvSvc as Inventory Service
  participant PaySvc as Payment Service
  Client->>OrderSvc: POST /orders
  OrderSvc->>InvSvc: Check stock
  InvSvc-->>OrderSvc: Stock available
  OrderSvc->>PaySvc: Charge card
  PaySvc-->>OrderSvc: Payment confirmed
  OrderSvc-->>Client: Order created

Synchronous chain: the client waits while the order service calls inventory and payment sequentially.

The total latency is the sum of all downstream calls plus network overhead. If the inventory check takes 50ms and the payment call takes 200ms, the client sees at least 250ms plus serialization and network time. Worse, if any service in the chain is down, the entire request fails. This temporal coupling means your system’s availability is the product of each service’s availability. Three services at 99.5% uptime give you 98.5% combined.

Synchronous communication works well for queries where the client genuinely needs the response before proceeding. It becomes problematic for commands where you are triggering side effects across multiple services.
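The latency and availability arithmetic above can be sketched directly. This is an illustrative model, not real service code: the latency figures are the ones from the example, and `check_stock`/`charge_card` are hypothetical stand-ins for downstream calls.

```python
import math

# Hypothetical downstream calls; latencies (ms) match the example above.
def check_stock() -> float:
    return 50.0    # inventory check

def charge_card() -> float:
    return 200.0   # payment call

def place_order_sync() -> float:
    """Client-observed latency is the sum of sequential downstream calls."""
    return check_stock() + charge_card()  # network overhead adds to this

def chain_availability(*uptimes: float) -> float:
    """Temporal coupling: the chain is only up when every service is up."""
    return math.prod(uptimes)

print(place_order_sync())                                  # 250.0 ms minimum
print(round(chain_availability(0.995, 0.995, 0.995), 4))   # 0.9851
```

The product rule is why long synchronous chains erode availability: each added hop multiplies in another factor below 1.0.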

Asynchronous: event-driven

The alternative is to decouple services through message queues. Instead of calling a service directly, you publish an event to a broker. Interested services consume events at their own pace.

graph LR
  OS["Order Service"] -->|"OrderPlaced"| MQ["Message Broker"]
  MQ -->|"consume"| IS["Inventory Service"]
  MQ -->|"consume"| PS["Payment Service"]
  MQ -->|"consume"| NS["Notification Service"]
  IS -->|"StockReserved"| MQ
  PS -->|"PaymentProcessed"| MQ

Event-driven: the order service publishes an event and moves on. Consumers process independently.

This eliminates temporal coupling. The order service does not need the inventory or payment service to be running at the moment the order is placed. Events queue up in the broker and get processed when the consumer is ready. Adding a new consumer (say, an analytics service) requires zero changes to the producer.

The tradeoff is that you lose the immediate feedback loop. The client gets an acknowledgment that the order was accepted, but the actual processing happens asynchronously. You need to design for eventual consistency and provide mechanisms for the client to check status later, through polling, webhooks, or server-sent events.
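The decoupling can be shown with a minimal in-memory stand-in for a broker. This is a sketch, not a real messaging client: the `Broker` class, topic name, and handlers are all illustrative.

```python
from collections import defaultdict, deque

class Broker:
    """Toy in-memory message broker: topics, queues, and subscribers."""
    def __init__(self):
        self.queues = defaultdict(deque)
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        self.queues[topic].append(event)  # producer returns immediately

    def deliver(self):
        # Consumers process at their own pace; here we drain synchronously.
        for topic, queue in self.queues.items():
            while queue:
                event = queue.popleft()
                for handler in self.subscribers[topic]:
                    handler(event)

broker = Broker()
processed = []
broker.subscribe("OrderPlaced", lambda e: processed.append(("inventory", e)))
broker.subscribe("OrderPlaced", lambda e: processed.append(("payment", e)))

# The producer publishes and moves on; it never references its consumers,
# so adding an analytics subscriber would not change this line at all.
broker.publish("OrderPlaced", {"order_id": 42})
broker.deliver()
print(processed)
```

Note the asymmetry: the producer knows only the event name, while each consumer independently decides whether it cares.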

The saga pattern for distributed transactions

In a monolith, you wrap multiple operations in a database transaction: debit the account, create the order, reserve inventory, all or nothing. In a distributed system, you cannot use a single transaction across services. Each service owns its own database, and two-phase commit (2PC) is fragile and slow at scale.

The saga pattern replaces a single transaction with a sequence of local transactions, each with a compensating action that undoes its effect if a later step fails. There are two coordination approaches: choreography and orchestration.

Choreography

In choreography, each service listens for events and decides what to do next. There is no central coordinator. The flow emerges from the event chain.

sequenceDiagram
  participant OS as Order Service
  participant IS as Inventory Service
  participant PS as Payment Service
  participant NS as Notification Service
  OS->>OS: Create order (PENDING)
  OS-)IS: OrderCreated event
  IS->>IS: Reserve stock
  IS-)PS: StockReserved event
  PS->>PS: Process payment
  PS-)NS: PaymentProcessed event
  PS-)OS: PaymentProcessed event
  NS->>NS: Send confirmation
  OS->>OS: Update order (CONFIRMED)
  Note over PS,OS: If payment fails...
  PS-)IS: PaymentFailed event
  IS->>IS: Release stock (compensate)
  IS-)OS: StockReleased event
  OS->>OS: Update order (CANCELLED)

Choreographed saga: services react to events and publish their own. Compensation flows backward on failure.

Choreography works well for simple flows with few steps. It avoids a single point of failure since there is no orchestrator to go down. The downside is that the business logic is spread across multiple services, making it hard to understand the full flow by reading any single codebase.
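The event chain in the diagram can be condensed into a single loop for illustration. Assume the same event names as above; the point is that each branch belongs to a different service and no coordinator sees the whole flow.

```python
def run_choreography(payment_succeeds: bool) -> dict:
    """Choreographed saga: services react to events and emit their own."""
    state = {"order": "PENDING", "stock_reserved": False}
    events = ["OrderCreated"]
    while events:
        event = events.pop(0)
        if event == "OrderCreated":            # Inventory Service reacts
            state["stock_reserved"] = True
            events.append("StockReserved")
        elif event == "StockReserved":         # Payment Service reacts
            events.append("PaymentProcessed" if payment_succeeds
                          else "PaymentFailed")
        elif event == "PaymentProcessed":      # Order Service reacts
            state["order"] = "CONFIRMED"
        elif event == "PaymentFailed":         # Inventory compensates
            state["stock_reserved"] = False
            events.append("StockReleased")
        elif event == "StockReleased":         # Order Service reacts
            state["order"] = "CANCELLED"
    return state

print(run_choreography(True))    # {'order': 'CONFIRMED', 'stock_reserved': True}
print(run_choreography(False))   # {'order': 'CANCELLED', 'stock_reserved': False}
```

In a real system each `elif` branch would live in a separate codebase, which is exactly why the full flow is hard to read in one place.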

Orchestration

In orchestration, a dedicated saga orchestrator tells each service what to do and handles the compensation logic centrally.

sequenceDiagram
  participant Orch as Order Saga Orchestrator
  participant IS as Inventory Service
  participant PS as Payment Service
  participant NS as Notification Service
  Orch->>IS: Reserve stock
  IS-->>Orch: Stock reserved
  Orch->>PS: Process payment
  PS-->>Orch: Payment processed
  Orch->>NS: Send confirmation
  NS-->>Orch: Confirmation sent
  Note over Orch,PS: If payment fails...
  Orch->>IS: Release stock (compensate)
  IS-->>Orch: Stock released
  Orch->>Orch: Mark order cancelled

Orchestrated saga: a central coordinator drives the flow and handles compensation.

Orchestration makes the business flow explicit and testable in one place. The orchestrator is a state machine that knows every step and every compensating action. The cost is a central component that becomes a coordination bottleneck and a single point of failure if not designed for reliability.

Most production systems use orchestration for complex flows (five or more steps) and choreography for simple, two-to-three-step interactions.
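The orchestrator-as-state-machine idea can be sketched as a generic runner that pairs each step with its compensating action. Step names and handlers are illustrative, not a real saga framework.

```python
def run_saga(steps, compensations):
    """Run steps in order; on failure, undo completed steps in reverse."""
    done = []
    for name, action in steps:
        try:
            action()
            done.append(name)
        except Exception:
            for prev in reversed(done):
                compensations[prev]()   # compensate backward through the saga
            return "CANCELLED"
    return "CONFIRMED"

log = []
def charge_payment():
    raise RuntimeError("payment declined")  # simulate the failure case

steps = [
    ("reserve_stock",     lambda: log.append("stock reserved")),
    ("charge_payment",    charge_payment),
    ("send_confirmation", lambda: log.append("confirmation sent")),
]
compensations = {
    "reserve_stock":  lambda: log.append("stock released"),
    "charge_payment": lambda: log.append("payment refunded"),
}
print(run_saga(steps, compensations))   # CANCELLED
print(log)                              # ['stock reserved', 'stock released']
```

Because every step and compensation is declared in one table, the whole business flow is visible and testable in a single place, which is the orchestration pattern's main selling point.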

Service mesh

As the number of services grows, cross-cutting concerns like retries, timeouts, mutual TLS, and traffic shaping become painful to implement in every service. A service mesh moves this logic out of the application and into a sidecar proxy that runs alongside each service instance.

graph TD
  subgraph Pod_A["Service A Pod"]
      SA["Service A"] <-->|"localhost"| PA["Envoy Proxy"]
  end
  subgraph Pod_B["Service B Pod"]
      SB["Service B"] <-->|"localhost"| PB["Envoy Proxy"]
  end
  subgraph Pod_C["Service C Pod"]
      SC["Service C"] <-->|"localhost"| PC["Envoy Proxy"]
  end
  PA <-->|"mTLS"| PB
  PA <-->|"mTLS"| PC
  PB <-->|"mTLS"| PC
  CP["Control Plane (Istiod)"] -->|"config"| PA
  CP -->|"config"| PB
  CP -->|"config"| PC

Service mesh architecture: sidecar proxies handle networking concerns. The control plane pushes configuration.

Istio and Linkerd are the two most common service mesh implementations. The sidecar proxy (typically Envoy) intercepts all inbound and outbound traffic. The control plane manages routing rules, security policies, and observability configuration. Your application code makes plain HTTP calls; the proxy handles mTLS, retries, circuit breaking, and distributed tracing headers transparently.

A service mesh is powerful but adds operational complexity. You are now running twice as many containers (every service gets a sidecar), debugging network issues requires understanding the proxy layer, and the control plane itself needs to be highly available. Adopt a mesh when you have more than 15-20 services and when the cross-cutting concern pain is real, not anticipated.
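To make the "cross-cutting concern" point concrete, here is a sketch of the retry-with-backoff logic a sidecar performs on the application's behalf; in a mesh this lives in the proxy, so application code makes a plain call with none of it. The function and its parameters are illustrative, not a real proxy API.

```python
import time

def call_with_retries(call, attempts=3, backoff_s=0.1):
    """Retry a failed call with exponential backoff (what a sidecar does)."""
    last_exc = None
    for attempt in range(attempts):
        try:
            return call()
        except ConnectionError as exc:
            last_exc = exc
            time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
    raise last_exc

# Simulate an upstream that fails twice, then recovers.
tries = {"n": 0}
def flaky():
    tries["n"] += 1
    if tries["n"] < 3:
        raise ConnectionError("upstream reset")
    return "ok"

print(call_with_retries(flaky))   # "ok", after two retried failures
```

Multiply this by timeouts, mTLS, circuit breaking, and tracing headers, across every service and every language in the fleet, and the case for pushing it into a shared proxy layer becomes clear.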

API gateway vs Backend for Frontend (BFF)

External clients should not call individual microservices directly. An API gateway sits at the edge and provides a single entry point, handling authentication, rate limiting, request routing, and response aggregation.

graph TD
  Web["Web App"] --> GW["API Gateway"]
  Mobile["Mobile App"] --> GW
  GW --> US["User Service"]
  GW --> OS["Order Service"]
  GW --> PS["Product Service"]

A single API gateway routes external traffic to internal services.

The problem with a single gateway is that different clients have different needs. A mobile app wants a compact payload optimized for bandwidth. A web dashboard wants rich, nested data. A third-party API consumer wants a stable, versioned contract. Cramming all of this into one gateway creates a bottleneck, both technically and organizationally.

The Backend for Frontend (BFF) pattern solves this by creating a dedicated backend for each client type.

graph TD
  Web["Web App"] --> WebBFF["Web BFF"]
  Mobile["Mobile App"] --> MobileBFF["Mobile BFF"]
  Partner["Partner API"] --> PartnerBFF["Partner BFF"]
  WebBFF --> US["User Service"]
  WebBFF --> OS["Order Service"]
  MobileBFF --> US
  MobileBFF --> PS["Product Service"]
  PartnerBFF --> OS
  PartnerBFF --> PS

BFF pattern: each client type gets its own backend that aggregates and tailors responses.

Each BFF is owned by the frontend team that consumes it: the web team controls what the web BFF returns, and the mobile team controls the mobile BFF. This eliminates the coordination bottleneck of a shared gateway team and lets each client evolve independently.
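The tailoring each BFF does can be sketched as two aggregation functions over the same services. The service responses here are hard-coded placeholders; a real BFF would fetch them over HTTP or gRPC.

```python
# Hypothetical downstream responses (a real BFF would call the services).
def user_service(user_id):
    return {"id": user_id, "name": "Ada", "email": "ada@example.com",
            "preferences": {"theme": "dark", "locale": "en"}}

def order_service(user_id):
    return [{"id": 1, "total": 42.0, "status": "SHIPPED"}]

def web_bff(user_id):
    """Web dashboard BFF: rich, nested data aggregated from services."""
    return {"user": user_service(user_id),
            "orders": order_service(user_id)}

def mobile_bff(user_id):
    """Mobile BFF: compact payload tailored for bandwidth."""
    user = user_service(user_id)
    orders = order_service(user_id)
    return {"name": user["name"], "order_count": len(orders)}

print(web_bff(7))
print(mobile_bff(7))   # {'name': 'Ada', 'order_count': 1}
```

Both BFFs call the same services, but each shapes the response for its own client, which is precisely what a single shared gateway struggles to do for everyone at once.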

Choosing the right pattern

There is no universal answer; the right pattern depends on the interaction.

Use synchronous request-response for real-time queries where the client needs an immediate answer, for simple CRUD operations, and for service-to-service calls within the same bounded context.

Use asynchronous events for notifications and side effects, for operations that span multiple services, for workloads where throughput matters more than latency, and for any interaction where you want to decouple the producer from the consumer.

Use sagas whenever a business process spans multiple services and requires all-or-nothing semantics. Default to orchestration for complex flows and choreography for simple ones.

Use a service mesh when cross-cutting networking concerns become a tax on every team. Use a BFF when different client types need fundamentally different API shapes.

What comes next

Services need to find each other before they can communicate. Static configuration breaks down as instances scale up and down dynamically. Continue with service discovery and registration to understand how services locate each other at runtime.
