Proxy Server

📘 Learn Proxy Server from zero

A proxy server is an intermediary that sits between a client and the destination server — instead of talking to the origin directly, the client sends its request to the proxy, which forwards it on, receives the response, and relays it back. In a system design interview this matters because proxies are the building block behind load balancers, CDNs, API gateways, and security perimeters. Knowing the mechanism (forward vs reverse proxy, what each hop sees, what gets cached) lets you justify why you'd place one in your design and what it costs you in latency and operational complexity.

✨ Added by the guide to build intuition — not from the source course.

Lessons in this topic

🏗️ Apply it — design walkthrough

Work through this after you've learned the concepts in the lessons above.

What problem does it solve?

🤔 A client could just open a TCP connection straight to the origin server. Why insert a middlebox that adds an extra network hop?

Cause → effect chain:

Without a proxy, every client talks directly to the origin → the origin's real IP is exposed, the origin must be directly reachable by every client, and there's no shared place to cache, filter, or balance traffic.
Insert a proxy → it becomes a single control point the architecture can attach behavior to: caching, access control, TLS termination, rate limiting, logging, and routing all live in one place.
Concrete: a proxy fronting 10 backend servers lets you add a new backend, block an IP range, or rotate a TLS cert once at the proxy instead of on 10 machines.

Trade-off / cost: you've added a network hop (typically sub-millisecond to ~1 ms within a datacenter; more if it terminates TLS or queues) and a new component that can fail. The proxy is now on the critical path, so it must be made highly available (redundant instances behind a VIP/DNS) or it becomes a single point of failure.

Forward vs reverse?

🤔 A corporate web filter and an Nginx load balancer in front of an API are both "proxies." One acts on behalf of the client, the other on behalf of the server. Which is which, and who configures each?

The side it represents is the whole distinction:

Forward proxy → sits in front of clients and acts on their behalf, configured by the client side. The origin server sees the proxy's IP, not the user's. Example: a company routes all employee traffic through a proxy to filter sites and hide internal IPs. client → forward proxy → internet → server.
Reverse proxy → sits in front of servers and acts on their behalf, configured by the server owner. The client thinks it's talking to the real server; it's actually hitting the proxy. Example: Nginx/Cloudflare in front of your backend doing TLS termination and load balancing. client → internet → reverse proxy → server pool.

Memory hook: a forward proxy hides the client from the server; a reverse proxy hides the servers from the client.

Trade-off / cost: a forward proxy only governs clients configured (or forced via network policy) to use it — clients with another route can bypass it. A reverse proxy must scale with all inbound traffic and becomes the public face of your whole system, so its capacity and availability bound the entire service.

How a request flows

🤔 Walk through a single GET request through a reverse proxy. What does the proxy actually do at each hop, and what does the backend see as the source IP?

Step-by-step mechanism:

1. Accept: client opens a connection to the proxy's public IP and sends GET /page.
2. Terminate / inspect: if TLS is terminated here, the proxy decrypts, reads the request, and can apply rules (auth, rate limit, routing by path/host).
3. Forward: the proxy opens its own connection to a chosen backend and replays the request. The backend now sees the proxy's IP as the source — the original client IP is preserved only if the proxy adds an X-Forwarded-For header (or uses the PROXY protocol).
4. Relay: backend responds → proxy receives it → proxy returns it to the client over the original client↔proxy connection.

So there are two separate TCP connections (client↔proxy and proxy↔backend), which is what lets the proxy pool and reuse backend connections across many clients instead of opening one per client.

Trade-off / cost: terminating TLS means the proxy sees plaintext — it's now a high-value target and must be secured. And because the backend sees the proxy IP, naive IP-based logging/rate-limiting on the backend breaks unless it reads X-Forwarded-For — which is client-spoofable unless the proxy overwrites it and the backend only trusts the value set by that known proxy hop.
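
The forwarding step and the spoofing caveat above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular proxy implements it: the function names `forward_headers` and `client_ip_seen_by_backend`, the `TRUSTED_PROXIES` set, and all IP addresses are hypothetical.

```python
# Hypothetical sketch: how a reverse proxy might set X-Forwarded-For, and how
# a backend might decide whether to trust it. Names and addresses are assumed.

TRUSTED_PROXIES = {"10.0.0.5"}  # assumed internal address of the known proxy


def forward_headers(incoming_headers: dict, peer_ip: str) -> dict:
    """Proxy side: build the headers that are sent on to the backend.

    Any client-supplied X-Forwarded-For is dropped and replaced with the
    address the proxy actually observed on the client connection.
    """
    headers = {k: v for k, v in incoming_headers.items()
               if k.lower() != "x-forwarded-for"}
    headers["X-Forwarded-For"] = peer_ip
    return headers


def client_ip_seen_by_backend(headers: dict, source_ip: str) -> str:
    """Backend side: honor the header only when a trusted proxy sent it."""
    if source_ip in TRUSTED_PROXIES and "X-Forwarded-For" in headers:
        return headers["X-Forwarded-For"]
    return source_ip  # direct or untrusted connection: use the TCP source


# A client at 203.0.113.7 tries to spoof the header; the proxy replaces it.
spoofed = {"Host": "example.com", "X-Forwarded-For": "1.2.3.4"}
to_backend = forward_headers(spoofed, peer_ip="203.0.113.7")
print(client_ip_seen_by_backend(to_backend, source_ip="10.0.0.5"))  # 203.0.113.7
```

The design choice here is overwrite rather than append. Real deployments with several proxy hops often append to the header instead, in which case the backend would need to know how many trailing entries come from hops it trusts.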

Why cache here?

🤔 If 1,000 users request the same logo image, why is caching it at a reverse proxy dramatically better than letting each request hit the backend? Put a number on the win.

Cause → effect with numbers:

Without caching: 1,000 requests → 1,000 trips to the backend. If the backend takes 50 ms to serve the image, it repeats that work 1,000 times and each user waits ~50 ms (plus network).
With proxy caching: the first request is a miss (~50 ms, fetched from backend and stored). The next 999 are hits served from the proxy's memory/disk in ~1–2 ms.
Effect: backend load for that asset drops from 1,000 requests to 1 request (a 99.9% reduction), and latency for the cached asset falls from ~50 ms to ~1–2 ms.

This is exactly how a CDN works — it's a globally distributed reverse-proxy cache placed close to users.
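
The arithmetic above can be checked with a small simulation. The sketch below is a toy in-memory TTL cache, assuming a hypothetical `ProxyCache` class and a stand-in backend function; a real proxy cache also has to deal with cache-control headers, eviction, and memory limits, none of which are modeled here.

```python
import time
from typing import Callable, Optional


class ProxyCache:
    """Tiny in-memory TTL cache standing in for a reverse proxy's cache."""

    def __init__(self, fetch_from_backend: Callable[[str], bytes], ttl_seconds: float):
        self.fetch_from_backend = fetch_from_backend
        self.ttl_seconds = ttl_seconds
        self.store = {}            # path -> (expires_at, body)
        self.backend_calls = 0     # how many requests reached the backend

    def get(self, path: str, now: Optional[float] = None) -> bytes:
        # `now` is injectable so the example does not depend on real time.
        now = time.monotonic() if now is None else now
        entry = self.store.get(path)
        if entry is not None and now < entry[0]:
            return entry[1]                        # hit: served from the proxy
        self.backend_calls += 1                    # miss: fetch and store
        body = self.fetch_from_backend(path)
        self.store[path] = (now + self.ttl_seconds, body)
        return body


# 1,000 requests for the same asset within the TTL window.
cache = ProxyCache(lambda path: b"logo-bytes", ttl_seconds=60)
for _ in range(1000):
    cache.get("/logo.png", now=0.0)

reduction = 1 - cache.backend_calls / 1000
print(cache.backend_calls, f"{reduction:.1%}")   # 1 99.9%
```

Calling `cache.get("/logo.png", now=61.0)` afterwards would count as a second backend call, because the 60-second TTL has expired by then.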

Trade-off / cost — staleness: if you update the logo, the proxy keeps serving the old one until its TTL expires. You now must manage cache invalidation (TTLs, cache-busting URLs like logo.v2.png, or explicit purges). Also beware the cache stampede: when a hot key's TTL expires, many concurrent requests miss at once and all hit the backend together — mitigated by request coalescing (one request fetches while the rest wait) or staggered/jittered TTLs.
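
Both stampede mitigations named above can be sketched briefly. This is an illustrative sketch only: `jittered_ttl` and `SingleFlight` are hypothetical names, the 10% spread is an arbitrary assumption, and error handling is deliberately minimal (if the fetch raises, the leading caller sees the exception and waiting callers receive `None`).

```python
import random
import threading


def jittered_ttl(base_seconds: float, spread: float = 0.1) -> float:
    """Vary the TTL by +/- spread so hot keys do not all expire together."""
    return base_seconds * random.uniform(1 - spread, 1 + spread)


class SingleFlight:
    """Request coalescing: one caller per key fetches, the others wait."""

    def __init__(self, fetch):
        self.fetch = fetch
        self.lock = threading.Lock()
        self.in_flight = {}   # key -> (done event, result holder)

    def get(self, key):
        with self.lock:
            entry = self.in_flight.get(key)
            leader = entry is None
            if leader:
                # First caller for this key becomes the leader.
                entry = (threading.Event(), {})
                self.in_flight[key] = entry
        done, holder = entry
        if leader:
            try:
                holder["value"] = self.fetch(key)   # the only backend call
            finally:
                with self.lock:
                    del self.in_flight[key]
                done.set()                          # wake the waiting callers
        else:
            done.wait()                             # reuse the leader's result
        return holder.get("value")


coalescer = SingleFlight(lambda key: f"body of {key}")
print(coalescer.get("/logo.png"))
print(round(jittered_ttl(60), 1))   # somewhere around 54 to 66 seconds
```

Coalescing only helps callers that arrive while a fetch is in progress; in practice it would sit in front of a cache so that later callers get a normal hit.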

Load balancer = proxy

🤔 An interviewer says "a layer-7 load balancer is just a special reverse proxy." Is that true, and what extra job does the balancing add on top of plain proxying?

For an L7 (HTTP) balancer, largely true — the proxying mechanism is the same, the selection logic is the addition:

In its simplest form, a reverse proxy forwards every request to a single configured backend.
A load balancer is a reverse proxy that, on each request, picks a backend from a pool using an algorithm — round-robin, least-connections, or hashing on a key.
Concrete: with 3 healthy backends and round-robin, requests go 1→2→3→1→2→3… When the proxy's health check finds backend 2 failing, it's removed from the pool and traffic splits across the remaining 2, so a dead server stops receiving traffic within seconds instead of returning errors.
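
The round-robin example above, including the health-check removal, can be sketched as follows. The `RoundRobinBalancer` class and backend names are hypothetical, and the health check itself is assumed to happen elsewhere and simply call `mark_down` or `mark_up`.

```python
class RoundRobinBalancer:
    """Pick the next healthy backend in rotation."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0

    def mark_down(self, backend):
        # Called by a (not shown) health checker when a backend fails.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Scan at most one full rotation, skipping backends marked unhealthy.
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")


lb = RoundRobinBalancer(["b1", "b2", "b3"])
print([lb.pick() for _ in range(6)])   # ['b1', 'b2', 'b3', 'b1', 'b2', 'b3']

lb.mark_down("b2")                     # health check found b2 failing
print([lb.pick() for _ in range(4)])   # ['b1', 'b3', 'b1', 'b3']
```

Least-connections or hash-based selection would replace only the body of `pick`; the proxying around it stays the same.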

(Caveat: an L4 load balancer forwards packets/connections rather than terminating and re-issuing HTTP, so it isn't a full reverse proxy — the equivalence holds cleanly at L7.)

Trade-off / cost — session affinity: if a user's session state lives on backend 1, round-robin sends their next request to backend 2, which doesn't have it → the user gets logged out. You then need sticky sessions (hash on user/cookie so they always hit the same backend), but stickiness skews load distribution and complicates failover (a downed backend loses its users' state anyway). The clean fix is to make backends stateless (store sessions in shared Redis), which is why "keep your servers stateless" is a recurring interview answer.
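
A minimal sketch of hash-based stickiness, and of why it complicates failover, is shown below. The function name `sticky_backend`, the backend names, and the session IDs are hypothetical; simple modulo hashing is assumed here for brevity and is not the only way to implement affinity.

```python
import hashlib


def sticky_backend(session_id: str, backends: list) -> str:
    """Map a session ID to a backend using a stable hash.

    hashlib is used instead of the built-in hash(), which is salted per
    process for strings and so would not be stable across restarts.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


pool = ["backend-1", "backend-2", "backend-3"]
sessions = [f"session-{i}" for i in range(1000)]

# With a fixed pool, the same session always lands on the same backend.
before = {s: sticky_backend(s, pool) for s in sessions}

# backend-3 goes down: every session that lived there must move, and modulo
# hashing can also move sessions that were on the surviving backends.
after = {s: sticky_backend(s, pool[:2]) for s in sessions}
moved = sum(1 for s in sessions if before[s] != after[s])
print(f"{moved} of {len(sessions)} sessions changed backend")
```

With stateless backends and sessions in a shared store, a changed mapping costs nothing, which is the point the paragraph above makes.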

When NOT to use one

🤔 Proxies aren't free wins. Name a scenario where adding a proxy is the wrong call, and explain the cost that outweighs the benefit.

Reach for the cost side of the ledger:

Ultra-low-latency / latency-sensitive paths: every proxy hop adds a small but real delay plus a buffering/serialization step. For a high-frequency trading feed or a real-time game's hot loop, even sub-millisecond overhead and added jitter can be unacceptable — you want direct connections.
Tiny / single-server systems: if you have one backend and no caching, security, or balancing need, a reverse proxy adds an operational component to monitor, patch, and keep available for zero functional gain. It's premature infrastructure.
End-to-end encryption requirements: if a TLS-terminating proxy would decrypt traffic it shouldn't see (e.g. zero-trust or regulated payloads), the proxy becomes a compliance liability; you'd use a pass-through/TCP (L4) proxy that never decrypts — giving up the inspection and caching features that require plaintext.

Rule of thumb: add a proxy when you need a shared control point (cache, balance, secure, route) across many backends or clients. If you can't name the specific job it does, the extra hop, the extra failure mode, and the extra ops burden aren't justified.

VPN vs Proxy

🤔 Both a VPN and a forward proxy can "hide your IP and route you through a server." So what's the actual technical difference — and why is a typical proxy not a substitute for a VPN for privacy?

The difference is the layer they operate at and what they cover:

Proxy = application-layer (L7 for an HTTP proxy; a SOCKS proxy sits lower, around L5), usually per-app. It intercepts traffic for a specific protocol/app (e.g. just your browser's HTTP via a configured proxy setting). Other apps on the machine still go direct. A plain HTTP proxy does not by itself encrypt the link between you and the proxy.
VPN = network-layer (L3), system-wide. It creates an encrypted tunnel that captures all the device's IP traffic — every app, every protocol — and routes it through the VPN server.

Why a proxy isn't a privacy substitute: it leaves non-proxied apps and protocols (DNS, other ports) exposed, and unless it's an encrypting tunnel the hop to the proxy can be observed on the local network. A VPN's encryption and full-device capture close both gaps. (Note: HTTPS still encrypts the payload either way; the distinction is about which traffic is tunneled and whether the link to the intermediary itself is encrypted.)

Trade-off / cost: a VPN's system-wide encrypted tunnel adds per-packet crypto overhead and routes everything through one server, which can reduce throughput and add latency, and it's an all-or-nothing path. A per-app proxy is lighter and selective but offers weaker, narrower protection. Choose by goal: selective app routing/filtering → proxy; full-device confidentiality → VPN.
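
The per-app scope of a proxy can be seen in how one is configured from code. The sketch below uses Python's standard `urllib.request.ProxyHandler`; the proxy address is a made-up placeholder, and nothing is contacted because the opener is only built, never used.

```python
import urllib.request

# Hypothetical proxy address; no connection is made until the opener is used.
PROXY_URL = "http://proxy.example.internal:3128"

proxy_handler = urllib.request.ProxyHandler({"http": PROXY_URL, "https": PROXY_URL})
proxied_opener = urllib.request.build_opener(proxy_handler)

# Only requests made through `proxied_opener` are routed via the proxy, e.g.
#   proxied_opener.open("http://example.com/")
# Other code in this process, and every other program on the machine, keeps
# its own routing. A VPN would instead capture traffic below the application,
# so no per-app setting like this would be needed.
print(proxy_handler.proxies)
```

This opt-in, per-client configuration is what makes a proxy selective and lightweight, and also what leaves unconfigured apps outside its protection.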

🎯 Guided practice

Easy — Classify the proxy. "A company routes all employee web browsing through a server that blocks gambling sites and logs visited URLs." Which type is it, and why?
Reasoning: Ask the diagnostic question — whom is the middleman acting for? Here it acts for the clients (employees) and faces outward to the internet, so it is a forward proxy. Filtering (block gambling) and access logging are textbook forward-proxy uses. Contrast: if the same box sat in front of the company's own web servers, deciding which backend serves an incoming request, it would be a reverse proxy. Pattern learned: identify the side being shielded — clients ⇒ forward, servers ⇒ reverse.
Medium — Design with a reverse proxy. A photo site has 5 identical backend servers and serves the same set of popular images to millions of users over HTTPS. Where do you place a proxy and what three jobs does it do?
Reasoning: Put a reverse proxy in front of the 5 backends. (1) Single entry point + load balancing: clients hit one public address; the proxy distributes requests (round-robin or least-connections) and lets you add/remove backends transparently. (2) TLS termination: terminate HTTPS at the proxy so the backends skip repeated crypto and you manage one certificate. (3) Caching: cache the popular images at the proxy so most reads never reach a backend — converting a cross-network origin round-trip into a fast local hit. Then stress-test it: the proxy is now a single point of failure, so you replicate it behind a VIP/DNS, and you tune cache TTL / invalidation to avoid serving stale images. Pattern learned: a reverse proxy is the natural home for routing + security + caching — exactly the layer Alex Xu inserts between clients and the web tier when scaling a single server into a tier.
Hard — Proxy or VPN? Remote employees must reach an internal admin dashboard and several internal databases and SSH boxes, with all of that traffic confidential over the public internet. A teammate proposes "just put a reverse proxy on the internet in front of the dashboard." Is that sufficient? What changes if the requirement is "anonymize one team's outbound web research from third-party sites"?
Reasoning: A reverse proxy only fronts the services you explicitly expose over HTTP(S) — it terminates TLS for that dashboard but does nothing for arbitrary database and SSH traffic, and it offers no client-side confidentiality for the rest of the device's connections. The requirement here is "encrypt and route all of a device's traffic to the internal network," which is exactly a VPN: it operates at the network layer (L3), captures all traffic system-wide, and always encrypts it in a tunnel — covering HTTP, DB, and SSH uniformly. The proxy works at L5–L7, per-application, and does not inherently encrypt. For the second scenario — anonymizing one team's outbound web research from destination sites — a forward proxy is the right tool: it masks the source IP toward third parties and can cache/log, without the cost of full-tunnel encryption of every connection. Pattern learned: match the tool to the layer and scope of the requirement — per-app routing/gateway ⇒ proxy (forward for clients, reverse for servers); whole-device encrypted tunnel ⇒ VPN.

✨ Added by the guide — work these before the full problem set.