What:
Application protocols (HTTP, gRPC, WebSocket) and naming backbones (DNS) that coordinate network data serialization and client-server routing.
Primary purpose:
Selecting the optimal communication contract based on payload budgets, connection latencies, and state dynamics.
Usually used for:
Microservice RPC, real-time push platforms, and public geo-located traffic routing.
How should I think about this inside system architectures?
🌐 Public HTTP Edge Gateway
Serve REST/JSON at the public API boundary to maximize public firewall compliance, client ease-of-use, and CDN caching compatibility.
⚡ Internal gRPC Microservices
Use gRPC (HTTP/2 binary streams) inside your private VPC network to maximize microservices performance and enforce strict type contracts.
🔌 Persistent Bidirectional Push
Replace repeated client HTTP polling with a durable, stateful WebSocket connection (over TCP) so the server can push events the moment they occur.
Needed When:
Designing microservices RPC meshes, low-latency mobile endpoints, push notifications, or geo-located entry endpoints.
Avoids:
High client HTTP polling spikes, payload parsing CPU overhead, connection handshake renegotiation latencies, and DNS staleness.
Optimizes For:
Payload bandwidth economy, connection handshake speeds, service contract safety, and geo-located entry path latency.
In production systems, clients resolve geo-routed edge endpoints via DNS and connect to public REST/WebSocket proxies, which translate incoming requests into internal gRPC calls.
Protocol Decision Tree
Systematically analyze connection dynamics to pick the correct application protocol:
- HTTP/REST: Stateless text format (JSON), native browser integration, robust CDN edge caching compatibility.
- gRPC (HTTP/2): Strongly-typed binary Protobuf contracts, full unary/bi-directional streaming support, and connection multiplexing.
- WebSocket vs Server-Sent Events (SSE) Matrix: Choosing real-time push backbones:
| Dimension | WebSocket | SSE |
|---|---|---|
| Direction | Bidirectional (Full-duplex) | Server → Client only |
| Transport Protocol | TCP persistent (HTTP Upgrade) | Standard HTTP (Long-lived stream) |
| Client Reconnection | Manual client logic | Built-in (EventSource auto-retry) |
| Data Serialization | Binary or Text (JSON) | UTF-8 Text only (needs Base64 for binary) |
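The decision tree above can be sketched as a small function. This is only an illustration of the ordering of the questions; the `Requirements` fields and the `choose_protocol` name are hypothetical, and a real design review would weigh more factors (caching, payload size, team tooling).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hypothetical checklist for one client/server link."""
    browser_client: bool  # caller is a browser without a gRPC-Web proxy
    server_push: bool     # server must send events the client did not request
    client_push: bool     # client also streams messages on the same channel
    internal_only: bool   # both ends live inside the private network


def choose_protocol(req: Requirements) -> str:
    # Internal service-to-service links: gRPC covers unary and streaming calls.
    if req.internal_only and not req.browser_client:
        return "gRPC"
    # Both directions need to push: full-duplex channel.
    if req.server_push and req.client_push:
        return "WebSocket"
    # Server-to-client only: SSE keeps plain HTTP and built-in reconnection.
    if req.server_push:
        return "SSE"
    # Default for request/response traffic at the public edge.
    return "HTTP/REST"
```

For example, `choose_protocol(Requirements(True, True, False, False))` selects SSE, since only the server pushes.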
| Benefit | Cost |
|---|---|
| gRPC Binary Serialization (Protocol Buffers over long-lived HTTP/2 streams can shrink payloads substantially compared with JSON; the exact saving depends on the message shape) | No Native Browser Access (requires an Envoy or gRPC-Web gateway proxy) |
| WebSocket Persistent Push (full-duplex delivery without polling overhead; latency is bounded by the network round trip rather than a polling interval) | Memory-Heavy State (holding 1M open TCP sockets consumes gigabytes of server RAM) |
Problem: Each open WebSocket connection locks server RAM (~10-20KB). Failing to implement dead connection cleanup lets stale inactive sockets consume gigabytes of heap memory.
Mitigation: Enforce strict application heartbeats (ping/pong commands) and deploy rate limiters on concurrent open sockets per user.
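A minimal sketch of that mitigation, assuming the server records the time of each pong and runs a periodic reaper. The `ConnectionRegistry` class, its limits, and the injected `clock` are hypothetical names for illustration; actually closing the sockets is left to the caller.

```python
import time
from typing import Callable, Dict, List, Tuple

ConnKey = Tuple[str, str]  # (user_id, connection_id)


class ConnectionRegistry:
    """Tracks open sockets, caps them per user, and finds dead ones."""

    def __init__(self, max_per_user: int = 5, pong_timeout_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_per_user = max_per_user
        self.pong_timeout_s = pong_timeout_s
        self.clock = clock
        self._last_pong: Dict[ConnKey, float] = {}

    def open(self, user_id: str, conn_id: str) -> bool:
        """Register a connection; refuse it if the user is at the cap."""
        active = sum(1 for (uid, _) in self._last_pong if uid == user_id)
        if active >= self.max_per_user:
            return False
        self._last_pong[(user_id, conn_id)] = self.clock()
        return True

    def pong(self, user_id: str, conn_id: str) -> None:
        """Record a heartbeat reply for a known connection."""
        key = (user_id, conn_id)
        if key in self._last_pong:
            self._last_pong[key] = self.clock()

    def reap(self) -> List[ConnKey]:
        """Drop and return connections whose last pong is too old."""
        now = self.clock()
        stale = [key for key, seen in self._last_pong.items()
                 if now - seen > self.pong_timeout_s]
        for key in stale:
            del self._last_pong[key]
        return stale
```

Calling `reap()` on a timer and closing every returned connection keeps silent clients from pinning memory indefinitely.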
Problem: Using DNS routing to failover to backup datacenters is restricted by client, ISP, and OS caching. Mismatched TTL policies mean traffic continues routing to offline servers for hours.
Mitigation: Configure short TTL values (~30s) or utilize Anycast routing to handle IP failovers transparently behind the same DNS address.
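To see why the TTL bounds failover time, here is a toy model of a single caching layer. `TtlCache` and the `authoritative` dictionary are hypothetical stand-ins for a resolver cache and the zone's current records; the hostname and IPs are documentation examples. Real clients sit behind several such layers, and some ignore or clamp the TTL.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Toy resolver cache: serves a cached answer until its TTL expires."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self.clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # name -> (ip, expires_at)

    def resolve(self, name: str, authoritative: Dict[str, str], ttl_s: float) -> str:
        now = self.clock()
        entry = self._entries.get(name)
        if entry and now < entry[1]:
            return entry[0]  # still fresh from the cache's point of view
        ip = authoritative[name]  # cache miss or expired: ask the source of truth
        self._entries[name] = (ip, now + ttl_s)
        return ip
```

If the record changes right after a lookup, this cache keeps returning the old IP until the TTL runs out, so a 30-second TTL caps the staleness of one layer at roughly 30 seconds.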
| Use Case | Problem Solved |
|---|---|
| Geo-routing | Return different IPs based on the location of the client's DNS resolver (which may differ from the client's own location) |
| Disaster Failover | Reroute user traffic to a backup region (restricted by client TTL caching delays) |
| Service Discovery | Internal endpoint routing (e.g. auth.internal.service) dynamically resolving to pod IPs |
- You are designing heavy microservices internal service meshes (gRPC).
- You must push bi-directional live stats or game data to browser clients (WebSocket).
- You are building public API gateways, web services, or developer-facing platforms (HTTP/REST).
- You need to route global customer traffic to the nearest web cluster PoP (GeoDNS or Anycast).
- Proxy & Reverse Proxy Patterns: SSL termination and API routing gateways.
- Load Balancing Algorithms: Traffic steering at L4 (TCP) vs L7 (HTTP).
- CDN & Edge Delivery: Caching static assets at network edge gateways.
HTTP/2 Multiplexing Mechanics
Under HTTP/1.1, browsers typically open at most about 6 concurrent TCP connections per domain, and each connection carries one request at a time. Additional requests wait in a queue (Head-of-Line Blocking at the connection level). gRPC builds on **HTTP/2**, which keeps a single persistent TCP connection per server and splits messages into binary frames interleaved across multiple independent logical streams. This removes request queueing at the HTTP level and spreads the cost of the TCP and TLS handshake across every request on the connection. Packet loss can still stall all streams at once, because TCP itself delivers bytes in order.
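The interleaving idea can be illustrated with a simplified model. This sketch treats a frame as a `(stream_id, chunk)` pair and uses plain round-robin scheduling; real HTTP/2 frames carry a binary header, and real implementations add flow control and prioritization, none of which is modeled here. All function names are hypothetical.

```python
from itertools import zip_longest
from typing import Dict, List, Tuple

Frame = Tuple[int, bytes]  # (stream_id, payload chunk)


def split_frames(stream_id: int, body: bytes, frame_size: int) -> List[Frame]:
    """Cut one message into fixed-size chunks tagged with its stream ID."""
    return [(stream_id, body[i:i + frame_size])
            for i in range(0, len(body), frame_size)]


def interleave(streams: Dict[int, bytes], frame_size: int) -> List[Frame]:
    """Round-robin the frames of all streams onto one shared connection."""
    per_stream = [split_frames(sid, body, frame_size)
                  for sid, body in streams.items()]
    wire: List[Frame] = []
    for round_frames in zip_longest(*per_stream):
        wire.extend(f for f in round_frames if f is not None)
    return wire


def reassemble(wire: List[Frame]) -> Dict[int, bytes]:
    """Receiver side: rebuild each message by grouping chunks per stream."""
    out: Dict[int, bytes] = {}
    for stream_id, chunk in wire:
        out[stream_id] = out.get(stream_id, b"") + chunk
    return out
```

With a large response on stream 1 and a small one on stream 3, the small response finishes after the first round instead of waiting behind the large one.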
DNS Anycast Routing Internals
For public, high-volume entryways, systems deploy **Anycast Routing**. Instead of DNS returning different IPs depending on geographic location, a single Anycast IP address is advertised from multiple sites via BGP (Border Gateway Protocol). Routers forward the client's packets to the site with the best BGP path, which is usually, but not always, the geographically closest datacenter. Terminating connections nearby shortens the round-trip time (RTT) paid during connection setup.
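A deliberately reduced sketch of why "closest" means closest in routing terms: it picks the site whose advertisement has the shortest AS path. Real BGP best-path selection evaluates several attributes before and after path length, so treat `pick_site` and the sample data (documentation-range AS numbers) as illustrative assumptions only.

```python
from typing import Dict, List


def pick_site(routes: Dict[str, List[int]]) -> str:
    """Choose the site advertising the shortest AS path.

    `routes` maps a site name to the AS path a router has learned for the
    shared Anycast prefix via that site. Ties break on site name so the
    result is deterministic.
    """
    return min(routes, key=lambda site: (len(routes[site]), site))


# The same prefix is announced from two sites; this router sees two paths.
learned = {
    "frankfurt": [64500, 64501],
    "virginia": [64500, 64502, 64503],
}
nearest = pick_site(learned)
```

A client behind this router is steered to the site with fewer network hops at the AS level, regardless of physical distance.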
API Design for Interviews
Protocol choice at the wire layer (HTTP vs gRPC vs WebSocket) is only half the story. Interviewers also probe how you shape the API contract itself — and how a **Proxy & Reverse Proxy** gateway sits in front to terminate TLS, route versioned paths, and translate public REST into internal gRPC.
REST vs GraphQL vs RPC
- REST: Resource-oriented URLs, cache-friendly GETs, simple for public third-party APIs. Weak when clients need wildly different field subsets from the same entity (over-fetching).
- GraphQL: Single endpoint, client-specified field selection. Excellent for mobile clients with bandwidth constraints. Cost: query complexity limits, harder CDN caching, N+1 resolver risk without DataLoader batching.
- RPC (gRPC / JSON-RPC): Action-oriented method calls with strict schemas. Best for internal microservice meshes where both sides control the contract. Requires a gateway proxy (Envoy, gRPC-Web) for browser clients.
Interview signal: public edge = REST/GraphQL behind an API gateway proxy; internal service-to-service = gRPC over a private mesh.
Cursor Pagination vs Offset
Offset pagination (?offset=10000&limit=20) degrades on large tables because the database must scan and discard 10,000 rows before returning the next page. **Cursor pagination** encodes the last seen sort key into an opaque token and queries with an indexed predicate like WHERE id > :cursor LIMIT 20. The database seeks directly to the cursor position in the index, so the cost stays roughly the same at any page depth. Trade-off: cursors cannot jump to an arbitrary page number (page 47 of search results).
    # Offset (slow at depth)
    GET /posts?offset=10000&limit=20

    # Cursor (stable cost)
    GET /posts?cursor=eyJpZCI6MTIzfQ&limit=20
    → SELECT * FROM posts WHERE id > 123 ORDER BY id LIMIT 20
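A small runnable sketch of the cursor flow, assuming the token is unpadded URL-safe Base64 over a compact JSON object (the example token above decodes to `{"id":123}`). The `posts` schema with a `title` column, the helper names, and the use of SQLite are assumptions for illustration.

```python
import base64
import json
import sqlite3
from typing import List, Optional, Tuple


def encode_cursor(last_id: int) -> str:
    """Pack the last seen sort key into an opaque, URL-safe token."""
    raw = json.dumps({"id": last_id}, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def decode_cursor(token: str) -> int:
    """Restore the stripped Base64 padding, then read the sort key back."""
    padded = token + "=" * (-len(token) % 4)
    return int(json.loads(base64.urlsafe_b64decode(padded))["id"])


def fetch_page(db: sqlite3.Connection, cursor: Optional[str],
               limit: int = 20) -> Tuple[List[tuple], Optional[str]]:
    """Return one page plus the cursor for the next page (None at the end)."""
    # No cursor means first page; assumes ids are positive integers.
    after_id = decode_cursor(cursor) if cursor else 0
    rows = db.execute(
        "SELECT id, title FROM posts WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, limit),
    ).fetchall()
    # A short page means the table is exhausted. A full final page yields
    # one extra empty request, which keeps the sketch simple.
    next_cursor = encode_cursor(rows[-1][0]) if len(rows) == limit else None
    return rows, next_cursor
```

The client treats the token as opaque and simply echoes back whatever `next_cursor` it received, which leaves the server free to change the encoding later.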
Idempotency Keys
Network retries make duplicate POST requests inevitable. Clients send a unique Idempotency-Key header; the API gateway or service stores the key-to-response mapping in a fast store such as Redis, for example with a 24-hour expiry. A retry with the same key returns the cached response instead of creating a second payment or order. Critical for payment and booking APIs where exactly-once effects matter at the business layer.
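The replay behavior can be sketched with an in-memory dictionary standing in for the key-value store. `IdempotencyStore` and its `run` method are hypothetical names. The sketch ignores concurrent in-flight duplicates, which a production version would need to guard with an atomic set-if-absent operation in the shared store.

```python
import time
from typing import Any, Callable, Dict, Tuple


class IdempotencyStore:
    """Caches the response for each idempotency key until it expires."""

    def __init__(self, ttl_s: float = 24 * 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_s = ttl_s
        self.clock = clock
        self._responses: Dict[str, Tuple[Any, float]] = {}  # key -> (response, expires_at)

    def run(self, key: str, handler: Callable[[], Any]) -> Any:
        """Execute handler once per key; replay the stored response on retries."""
        now = self.clock()
        hit = self._responses.get(key)
        if hit and now < hit[1]:
            return hit[0]  # retry: no second side effect
        response = handler()
        self._responses[key] = (response, now + self.ttl_s)
        return response


# Usage: a retried payment request creates only one charge.
charges = []


def create_charge() -> dict:
    charges.append("charge")
    return {"status": "created", "charge_no": len(charges)}


store = IdempotencyStore()
first = store.run("key-abc", create_charge)
retry = store.run("key-abc", create_charge)
```

Both calls return the same response body, and the handler's side effect happens a single time.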
API Versioning: URL vs Header
- URL path (/v2/users): Explicit, easy to route at the reverse proxy layer, visible in access logs. Most common in public REST APIs.
- Header (Accept: application/vnd.myapp.v2+json): Keeps URLs stable; useful when the resource path stays the same but the response shape evolves. Harder to debug from logs alone.
Either approach works — state your choice and note that the reverse proxy can route /v1/* to legacy pods and /v2/* to new deployments, letting you run both versions side-by-side during migration (see **Proxy & Reverse Proxy Patterns**).
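Both schemes can coexist behind one routing function. The following sketch checks the URL path first and falls back to the Accept header; the `UPSTREAMS` pool names, the default version, and the precedence order are assumptions, not a prescribed gateway configuration.

```python
import re
from typing import Mapping

_PATH_RE = re.compile(r"^/v(\d+)(?:/|$)")
_ACCEPT_RE = re.compile(r"application/vnd\.myapp\.v(\d+)\+json")

# Hypothetical upstream pools, one per API version.
UPSTREAMS = {1: "legacy-pods", 2: "new-pods"}


def resolve_version(path: str, headers: Mapping[str, str], default: int = 1) -> int:
    """Pick the API version from the path, then the Accept header."""
    match = _PATH_RE.match(path)
    if match:
        return int(match.group(1))
    match = _ACCEPT_RE.search(headers.get("Accept", ""))
    if match:
        return int(match.group(1))
    return default


def route(path: str, headers: Mapping[str, str]) -> str:
    """Map a request to its upstream pool; unknown versions go to the default."""
    version = resolve_version(path, headers)
    return UPSTREAMS.get(version, UPSTREAMS[1])
```

For instance, `route("/v2/users", {})` and `route("/users", {"Accept": "application/vnd.myapp.v2+json"})` both resolve to the version 2 pool.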
For structured error envelopes, webhook delivery, batch APIs with partial success, and long-running job contracts, see API Contract & Integration Design; this section focuses on wire-protocol selection and pagination mechanics.