Proxies: forward and reverse

In this series (20 parts)
  1. What is system design and why it matters
  2. Estimations and back-of-envelope calculations
  3. Scalability: vertical vs horizontal scaling
  4. CAP theorem and distributed system tradeoffs
  5. Consistency models
  6. Load balancing
  7. Caching: strategies and patterns
  8. Content Delivery Networks
  9. Databases: SQL vs NoSQL and when to use each
  10. Database replication
  11. Database sharding and partitioning
  12. Consistent hashing
  13. Message queues and event streaming
  14. API design: REST, GraphQL, gRPC
  15. Rate limiting and throttling
  16. Proxies: forward and reverse
  17. Networking concepts for system design
  18. Reliability patterns: timeouts, retries, circuit breakers
  19. Observability: logging, metrics, tracing
  20. Security in system design

Almost every HTTP request your browser sends passes through at least one proxy before it reaches the origin server. Often it passes through several. Proxies are the invisible plumbing of the internet, and in production systems they are the layer where you enforce security policy, terminate TLS, shape traffic, and buy yourself time before requests ever touch application code. If you have read through rate limiting, you already know that throttling decisions have to happen somewhere. Proxies are that somewhere.

A proxy is any intermediary that sits between a client and a server, forwarding requests on behalf of one side or the other. The distinction between forward and reverse comes down to which side the proxy represents.

Forward proxies: acting on behalf of the client

A forward proxy sits in front of clients. The client sends its request to the proxy, and the proxy forwards it to the destination server. The server sees the proxy’s IP address, not the client’s. The client knows the proxy exists and is explicitly configured to use it.

Corporate networks are the classic example. An organization routes all outbound HTTP traffic through a forward proxy so it can enforce access policies, log requests, and filter content. When an employee visits a website, the request goes from their machine to the proxy, the proxy evaluates it against policy, and if approved, forwards it to the external server. The response follows the same path in reverse.

Forward proxies serve several purposes. Access control is the most obvious: block domains, restrict protocols, enforce authentication. Privacy is another: by masking client IPs behind the proxy’s address, you reduce the fingerprint visible to external servers. Caching is a third: if 200 employees request the same software update, the proxy can serve it from its local cache after the first download, saving bandwidth and latency.

VPNs function much like a forward proxy, though they operate at the network layer rather than per-protocol: they tunnel all client traffic through a remote endpoint, changing the apparent origin of requests. The core principle is the same: an intermediary acts on behalf of the client and hides the client’s true identity from the destination.

The important characteristic of a forward proxy is that the client is aware of it. The client’s network stack or browser is configured to route traffic through the proxy. The destination server, on the other hand, has no idea the proxy exists. It just sees a request from an IP address that happens to belong to the proxy.
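That explicit configuration looks like this in practice. The sketch below uses the Python standard library; the proxy address is hypothetical, and the final request is commented out since it only works on a network where that proxy exists.

```python
import urllib.request

# Explicit client-side configuration: all traffic from this opener is
# routed through the forward proxy (address is a made-up example).
proxy = urllib.request.ProxyHandler({
    "http":  "http://proxy.corp.example.com:3128",
    "https": "http://proxy.corp.example.com:3128",
})
opener = urllib.request.build_opener(proxy)
# opener.open("http://example.com")  # would reach example.com via the proxy
```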

Reverse proxies: acting on behalf of the server

A reverse proxy sits in front of servers. The client sends its request to what it thinks is the server, but it is actually hitting the proxy. The proxy then decides which backend server should handle the request and forwards it accordingly. The client has no idea the proxy exists.

This is the configuration that matters most in system design interviews and production architectures. When you type api.example.com into a browser, DNS resolves to the IP of a reverse proxy, not the application server. The proxy accepts the connection, inspects the request, and routes it to one of potentially hundreds of backend instances.

graph LR
  U1["Client A"] --> FP["Forward Proxy"]
  U2["Client B"] --> FP
  FP -->|"outbound"| Internet["Internet"]
  Internet --> RP["Reverse Proxy"]
  RP --> S1["App Server 1"]
  RP --> S2["App Server 2"]
  RP --> S3["App Server 3"]
  S1 --> DB["Database"]
  S2 --> DB
  S3 --> DB

Forward proxies represent clients (left side). Reverse proxies represent servers (right side). Clients configure forward proxies explicitly; they never know about reverse proxies.

Reverse proxies unlock a long list of capabilities that would be painful or impossible to implement at the application layer.

TLS termination

Handling TLS at the proxy means your backend servers deal with plain HTTP internally. This simplifies certificate management (one place to rotate certs), reduces CPU load on application servers (TLS handshakes are expensive, roughly 2 to 5 ms of CPU time per connection on modern hardware), and centralizes security policy. Nginx can terminate 30,000 TLS connections per second on a single 8-core machine. Your application servers never see encrypted traffic.

Load distribution

A reverse proxy is, in most deployments, also a load balancer. It distributes incoming requests across backend instances using algorithms like round-robin, least connections, or consistent hashing. The client connects to one address; the proxy fans out to many. This is how you scale horizontally without exposing internal topology to the outside world.
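Two of those algorithms can be sketched in a few lines, assuming a hypothetical list of backend addresses. Real balancers also handle health checks and connection cleanup; this shows only the selection logic.

```python
import itertools

backends = ["app1:8080", "app2:8080", "app3:8080"]

# Round-robin: hand out backends in a fixed rotation.
rr = itertools.cycle(backends)

def pick_round_robin():
    return next(rr)

# Least connections: track open connections and pick the least loaded
# backend. The caller would decrement the count when the request ends.
open_conns = {b: 0 for b in backends}

def pick_least_connections():
    target = min(open_conns, key=open_conns.get)
    open_conns[target] += 1
    return target

print([pick_round_robin() for _ in range(4)])
# ['app1:8080', 'app2:8080', 'app3:8080', 'app1:8080']
```

Round-robin is ideal when requests are uniform; least connections adapts better when some requests are much slower than others.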

Compression and caching

Reverse proxies can compress responses before sending them to clients, reducing bandwidth by 60 to 80% for text-based content like JSON and HTML. They can also cache responses for static assets or even dynamic content with appropriate cache headers. This dovetails directly with CDN architecture, since CDNs are essentially geographically distributed reverse proxy caches.
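The savings on repetitive text are easy to demonstrate with the standard library. The payload below is synthetic; exact ratios depend on the content, but structured JSON routinely compresses by well over half.

```python
import gzip
import json

# A typical repetitive JSON response body.
payload = json.dumps(
    [{"id": i, "status": "active", "region": "us-east-1"} for i in range(500)]
).encode()

compressed = gzip.compress(payload)
savings = 100 * (1 - len(compressed) / len(payload))
print(f"{len(payload)} -> {len(compressed)} bytes ({savings:.0f}% smaller)")
```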

Security and filtering

The proxy is the natural place to enforce rate limits, block malicious IPs, validate headers, and reject malformed requests before they reach your application code. A request that fails basic validation at the proxy layer costs you nearly nothing. The same request reaching your application server, parsing the body, hitting the database, and then failing costs you orders of magnitude more.
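A sketch of that cheap early rejection, with illustrative limits and status codes. A real proxy does this before reading the full body, which is exactly why it is so inexpensive.

```python
MAX_BODY_BYTES = 1_000_000
ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE"}

def validate(method, path, headers, body_len):
    """Return an HTTP error status, or None if the request may proceed."""
    if method not in ALLOWED_METHODS:
        return 405                      # method not allowed
    if body_len > MAX_BODY_BYTES:
        return 413                      # payload too large
    if "host" not in headers:
        return 400                      # malformed: missing Host header
    return None                         # passes: forward to a backend

print(validate("GET", "/api/users", {"host": "api.example.com"}, 0))    # None
print(validate("TRACE", "/api/users", {"host": "api.example.com"}, 0))  # 405
```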

Request routing

In a microservices architecture, the reverse proxy routes requests to different backend services based on the URL path, hostname, or headers. A request to /api/users goes to the user service. A request to /api/orders goes to the order service. The client sees one unified API surface; behind the proxy, dozens of independent services handle the work.
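The core of that routing is a prefix table, sketched here with hypothetical service names. Ordering the table most-specific-first gives longest-prefix-wins behavior.

```python
ROUTES = [
    ("/api/users",  "user-service:8080"),
    ("/api/orders", "order-service:8080"),
    ("/",           "web-frontend:8080"),   # catch-all, must come last
]

def route(path):
    for prefix, backend in ROUTES:
        if path.startswith(prefix):
            return backend
    return None

print(route("/api/users/42"))   # user-service:8080
print(route("/api/orders"))     # order-service:8080
print(route("/index.html"))     # web-frontend:8080
```

Production proxies match on hostname and headers as well, but path-prefix dispatch is the backbone of most configurations.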

Nginx and HAProxy: the workhorses

Two tools dominate the reverse proxy landscape for self-managed infrastructure: Nginx and HAProxy.

Nginx started as a web server designed to solve the C10K problem (handling 10,000 concurrent connections on a single machine). It uses an event-driven, non-blocking architecture. A single Nginx worker process can handle thousands of concurrent connections because it never blocks on I/O. Instead, it registers callbacks with the kernel’s event notification system (epoll on Linux, kqueue on BSD) and processes events as they arrive. In practice, a 4-core machine running Nginx can comfortably proxy 50,000 to 100,000 requests per second for typical HTTP workloads.

HAProxy is purpose-built for proxying and load balancing. It operates at both Layer 4 (TCP) and Layer 7 (HTTP). Where Nginx grew from a web server into a proxy, HAProxy was a proxy from day one. It excels at high-throughput TCP proxying, health checking, and connection draining. HAProxy can sustain over 2 million concurrent connections and process hundreds of thousands of requests per second on modest hardware. Its configuration is more specialized than Nginx’s, but for pure proxying and load balancing, it is hard to beat.

Both tools support hot reloading of configuration without dropping connections, which matters when you are deploying to production 50 times a day. Both support health checks that automatically remove unhealthy backends from the rotation. Both log detailed metrics that feed into monitoring systems. Choosing between them often comes down to whether you need Nginx’s broader feature set (static file serving, built-in scripting with Lua, native gRPC proxying) or HAProxy’s raw performance for TCP-level workloads.

In cloud-native environments, managed reverse proxies like AWS ALB, Google Cloud Load Balancer, and Azure Application Gateway handle the same responsibilities without requiring you to manage Nginx or HAProxy instances. The concepts are identical; the operational burden shifts to the cloud provider.

API gateways vs reverse proxies

An API gateway is a reverse proxy with opinions. It does everything a reverse proxy does (routing, TLS termination, load balancing) and adds application-level concerns on top: authentication, authorization, request transformation, response aggregation, schema validation, and developer portal integration.

The distinction matters because it determines where you draw architectural boundaries. A reverse proxy like Nginx operates at the infrastructure layer. It routes bytes efficiently and does not care about your business logic. An API gateway like Kong, AWS API Gateway, or Apigee operates at the application layer. It understands your API contracts, enforces per-client rate limits based on API keys, transforms request payloads between versions, and provides analytics dashboards showing which endpoints your customers use most.

For a small team running a handful of services, Nginx as a reverse proxy is sufficient. You configure routes, set up TLS, add basic rate limiting, and move on. For a platform team managing 200 microservices with external developer consumers, an API gateway provides the control plane you need. The gateway becomes the single enforcement point for authentication (validate JWT tokens before requests reach services), quota management (track usage per API key), protocol translation (accept REST from external clients, convert to gRPC for internal services), and versioning (route /v1/ and /v2/ to different service deployments).
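The quota-management piece can be sketched as a fixed-window counter per API key. The window size, limit, and key names are illustrative; real gateways usually use sliding windows or token buckets, as covered in the rate limiting part of this series.

```python
import time

WINDOW = 60    # seconds per quota window
LIMIT = 100    # requests per key per window
usage = {}     # api_key -> (window_start, count)

def allow(api_key, now=None):
    """Return True if the key is under quota, False if it should get a 429."""
    now = time.time() if now is None else now
    start, count = usage.get(api_key, (now, 0))
    if now - start >= WINDOW:          # window expired: reset the counter
        start, count = now, 0
    if count >= LIMIT:
        return False
    usage[api_key] = (start, count + 1)
    return True

print(all(allow("client-a", now=0) for _ in range(100)))  # True
print(allow("client-a", now=1))                            # False (over quota)
print(allow("client-b", now=1))                            # True (separate key)
```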

The trap is adopting an API gateway before you need one. Gateways add latency (typically 5 to 15 ms per hop depending on the features enabled), operational complexity, and a single point of failure that requires its own high-availability setup. Start with a reverse proxy. Graduate to a gateway when the application-level concerns justify the cost.

Service meshes: proxies at every hop

A service mesh takes the proxy concept and distributes it across every service in your architecture. Instead of one central reverse proxy at the edge, each service instance gets its own proxy running as a sidecar process alongside the application container. This sidecar handles all inbound and outbound network traffic for the service.

Istio (using Envoy as the sidecar proxy) and Linkerd are the two dominant implementations. In a service mesh, when Service A calls Service B, the request flows from A’s application code to A’s sidecar proxy, across the network to B’s sidecar proxy, and then into B’s application code. Every hop is proxied.

This sounds like overhead, and it is. Each sidecar adds 1 to 3 ms of latency per hop and consumes 50 to 100 MB of memory. For a request that traverses five services, you add 10 to 30 ms of latency from the mesh alone. The benefit is that you get mutual TLS between all services (zero-trust networking), fine-grained traffic control (canary deployments, circuit breaking, retries with exponential backoff), and observability (distributed tracing, request-level metrics) without modifying any application code.

Service meshes make sense when you operate at scale with tens or hundreds of microservices and need uniform security and observability guarantees. For a system with five services, the mesh overhead is not worth it. A reverse proxy at the edge plus application-level libraries for retries and circuit breaking will serve you better.

Choosing the right proxy layer

The decision tree is straightforward. If you need to control outbound traffic from clients, you want a forward proxy. If you need to protect and distribute traffic to backend servers, you want a reverse proxy. If you need application-level API management, you want an API gateway. If you need uniform security and observability across a large microservices deployment, you want a service mesh.

In practice, production systems combine multiple layers. A typical request path looks like this: the client’s corporate forward proxy sends the request to a CDN edge node (a geographically distributed reverse proxy cache), which forwards cache misses to an API gateway at the origin, which routes to the correct microservice, where a service mesh sidecar handles the final delivery. Four proxy hops, each serving a different purpose.

The guiding principle is to push concerns as far upstream as possible. Reject bad traffic at the edge, not at the database. Cache responses at the CDN, not at the application server. Terminate TLS at the reverse proxy, not in your business logic. Each layer of proxying exists to prevent unnecessary work from reaching the layers behind it.

Latency cost increases with each proxy layer. Reverse proxies add negligible overhead. API gateways and service meshes trade latency for functionality.

Common mistakes

Putting business logic in the proxy layer is the most frequent error. Proxies should route, filter, and transform. They should not query databases, call external APIs, or implement domain rules. The moment your Nginx config contains embedded Lua that joins user records, you have blurred the line between infrastructure and application.

Running a single proxy instance without redundancy is another. Your proxy is the front door to your entire system. If it goes down, everything behind it is unreachable. Always run at least two instances with health checking and failover. For Nginx, this means an active-passive pair with keepalived or an active-active pair behind DNS round-robin. For cloud load balancers, redundancy is built in.
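The failover decision itself is simple, which is why not having it is such an unforced error. A sketch, with made-up instance names: route traffic to the first healthy instance in a preference-ordered list.

```python
instances = ["proxy-primary", "proxy-standby"]        # preference order
healthy = {"proxy-primary": False, "proxy-standby": True}  # primary just failed

def pick_proxy():
    """Return the first healthy instance, the way keepalived-style failover would."""
    for name in instances:
        if healthy.get(name):
            return name
    raise RuntimeError("no healthy proxy instance")

print(pick_proxy())  # proxy-standby
```

The hard part in production is not this logic but the health checking that feeds it and the shared virtual IP or DNS mechanics that make the switch invisible to clients.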

Ignoring proxy metrics is the third. Your proxy sees every request. It knows the request rate, error rate, latency distribution, and backend health. If you are not shipping those metrics to your monitoring stack, you are flying blind. Nginx’s stub_status module and HAProxy’s stats page are starting points, but production systems need full integration with Prometheus, Datadog, or equivalent.

What comes next

Proxies operate at the network layer, and understanding how that layer works in depth gives you leverage over everything we have discussed. Networking concepts covers DNS resolution, TCP handshakes, connection pooling, and the mechanics that underpin every proxy interaction. That knowledge turns proxy configuration from guesswork into engineering.
