This problem appears in multiple sheets. Depth expectations increase as you progress:
| Track | What to demonstrate |
|---|---|
| Arch 25 | Pure infrastructure problem. Nail PoP hierarchy, cache hit ratio math, invalidation strategies, origin shield with consistent hashing, and geo-routing latency trade-offs. |
| Arch 50 | Add dynamic content caching (edge compute), TLS termination at edge, and cache poisoning / request smuggling defenses. |
| Arch 75 | Staff: design a CDN control plane (tenant isolation, config propagation to 200+ PoPs), and cost model for building vs buying at 10 Tbps peak. |
Interview Prompt
Design a Content Delivery Network (CDN) that caches static and cacheable dynamic content at edge locations worldwide. Origin servers sit in one or few regions; users worldwide should get low-latency responses with high cache hit ratio.
Clarifying Questions (ask before designing)
| Question | Why it matters |
|---|---|
| Static only, or dynamic/cacheable API responses too? | Static (images, JS, video segments) = long TTL, immutable URLs. Dynamic = short TTL or edge-side includes / cache keyed by Vary headers. |
| What's peak bandwidth and request rate? | 10 Tbps peak drives PoP count, edge server sizing, and origin shield necessity. 1M RPS drives connection handling architecture. |
| Single tenant or multi-tenant CDN-as-a-service? | Multi-tenant needs config isolation, per-tenant cache namespaces, and fair queuing on origin fetch. |
| Invalidation latency requirement? | Immediate purge (< 30 sec globally) vs eventual (TTL expiry) — fundamentally different control plane design. |
Scope
In scope
- PoP architecture and cache hierarchy
- Geo-routing and DNS/load balancing to nearest edge
- Origin shield and consistent hashing
- Cache invalidation (purge API, TTL, surrogate keys)
- Hit ratio optimization and capacity math
Out of scope (state explicitly)
- Full DNS provider design (assume GeoDNS exists)
- DDoS mitigation internals (WAF/scrubbing center as black box)
- Video-specific ABR (see #15)
- Origin application design
Assumptions
- 200+ PoPs globally, 10 Tbps peak aggregate egress
- Origin in us-east-1; 95% target cache hit ratio
- Mixed content: 70% static assets, 30% cacheable API
- Multi-tenant SaaS CDN serving 10K customers
These foundational concepts underpin the patterns used in this problem. Review them before deep-diving into component-level trade-offs.
Functional Requirements
- Cache and serve static content (images, videos, CSS, JS, fonts) from edge servers closest to the user
- Route user requests to the optimal edge server (lowest latency)
- If content not cached at edge, fetch from origin server ("cache miss")
- Support cache invalidation / purge on demand
- Support SSL/TLS termination at the edge
- Provide analytics: bandwidth usage, cache hit ratio, latency by region
- Support custom cache rules (TTL, cache key customization, query string handling)
Non-Functional Requirements
- Low Latency: Serve content in < 50 ms from edge (vs 200+ ms from origin)
- High Availability: 99.99% (a CDN failure degrades user experience globally)
- Massive Scale: Serve 100+ Tbps of bandwidth globally
- Global Coverage: 200+ Points of Presence (PoPs) worldwide
- Cache Efficiency: > 90% cache hit ratio for popular content
- Fault Tolerant: Individual PoP failures should be transparent to users
- DDoS Resilient: Absorb volumetric attacks at the edge
| Metric | Calculation | Value |
|---|---|---|
| Total PoPs | Given (assumption documented in value) | 300 |
| Servers per PoP | Given (assumption documented in value) | 50-500 (varies by PoP size) |
| Total edge servers | Given (assumption documented in value) | 50,000 |
| Peak bandwidth | Given (peak load assumption) | 200 Tbps |
| Requests / sec | From Requests / day ÷ 86400 (+ peak factor in value) | 50M (globally) |
| Cache storage per PoP | Given (assumption documented in value) | 100 TB |
| Total cached content | 300 × 100 TB | 30 PB |
| Origin requests (10% miss rate) | 50M req/sec × 10% miss rate | 5M/sec |
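The derived rows in the table above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the table's own figures; the function names are illustrative, and 1 PB is taken as 1000 TB.

```python
# Back-of-envelope capacity math using the figures from the table above.
POPS = 300
CACHE_PER_POP_TB = 100
GLOBAL_RPS = 50_000_000
MISS_RATE = 0.10  # 1 - cache hit ratio


def total_cache_pb(pops: int, per_pop_tb: float) -> float:
    """Aggregate cache capacity in PB (assuming 1 PB = 1000 TB)."""
    return pops * per_pop_tb / 1000


def origin_rps(global_rps: int, miss_rate: float) -> float:
    """Requests per second that miss at the edge and fall through toward origin."""
    return global_rps * miss_rate


print(total_cache_pb(POPS, CACHE_PER_POP_TB))  # 30.0
print(origin_rps(GLOBAL_RPS, MISS_RATE))
```

Halving the miss rate halves the origin request rate, which is why the hit ratio target dominates origin sizing.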
DNS-Based Routing (GeoDNS)
- User's DNS resolver sends query for cdn.example.com
- CDN's authoritative DNS server looks up the resolver's IP location
- Returns the IP address of the closest/fastest PoP
- Pros: Simple, widely supported
- Cons: DNS TTL caching means slow failover; resolver IP ≠ user IP
Anycast Routing (Alternative/Complementary)
- Multiple PoPs announce the same IP address via BGP
- Network routing naturally sends packets to the closest PoP
- Pros: Instant failover, immune to DNS TTL issues
- Cons: No per-user control, relies on ISP routing tables
- Real-world: Cloudflare uses Anycast; AWS CloudFront uses GeoDNS
Edge Server (Cache Node)
- Reverse proxy (NGINX / custom): Handles TLS, request routing, caching
- Cache storage: Memory (RAM) for hot content (64 GB), SSD for warm content (2-10 TB), HDD for cold content (50+ TB)
- Cache lookup: Hash(cache_key) → check memory → check SSD → check HDD → cache miss
- LRU eviction: Least Recently Used items evicted when cache is full
- Consistent hashing: Within a PoP, requests are routed to specific servers based on content hash → avoids duplicate caching
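The tiered lookup and LRU eviction described above can be sketched as follows. This is an illustration under simplifying assumptions, not a production design: capacity is counted in entries rather than bytes, entries evicted from one tier are demoted to the next, and the class names `LRUTier` and `EdgeCache` are hypothetical.

```python
from collections import OrderedDict
from typing import List, Optional, Tuple


class LRUTier:
    """One cache tier (memory, SSD, or HDD) with LRU eviction by entry count."""

    def __init__(self, name: str, capacity: int) -> None:
        self.name = name
        self.capacity = capacity
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, body: bytes) -> List[Tuple[str, bytes]]:
        """Insert an entry and return whatever was evicted to stay within capacity."""
        self._items[key] = body
        self._items.move_to_end(key)
        evicted = []
        while len(self._items) > self.capacity:
            evicted.append(self._items.popitem(last=False))  # least recently used
        return evicted


class EdgeCache:
    """Checks tiers in order (memory -> SSD -> HDD) and promotes lower-tier hits."""

    def __init__(self, tiers: List[LRUTier]) -> None:
        self.tiers = tiers

    def store(self, key: str, body: bytes, tier_index: int = 0) -> None:
        evicted = self.tiers[tier_index].put(key, body)
        if tier_index + 1 < len(self.tiers):
            for old_key, old_body in evicted:  # demote instead of dropping
                self.store(old_key, old_body, tier_index + 1)

    def lookup(self, key: str) -> Tuple[str, Optional[bytes]]:
        for index, tier in enumerate(self.tiers):
            body = tier.get(key)
            if body is not None:
                if index > 0:
                    self.store(key, body)  # promote hot content to the first tier
                return tier.name, body
        return "MISS", None
```

A `"MISS"` result is the point where the edge would fetch from the origin shield and then call `store`.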
Origin Shield (Mid-Tier Cache)
- Why: Without it, 300 PoPs each cache-miss independently → origin gets hammered with 300 requests for the same content
- How: Intermediate cache between PoPs and origin. All PoPs in a region route cache misses through the shield
- Benefit: Origin sees 3-5 cache-miss requests instead of 300
- Placement: 3-5 regional shields (US-East, US-West, Europe, Asia)
Cache Key
Default cache key: scheme + host + path + query_string
https://cdn.example.com/images/logo.png?v=2
Customizable:
- Ignore query string (for static assets)
- Include cookies (for personalized content)
- Include headers (Accept-Encoding, Accept-Language)
- Include device type (mobile vs desktop)
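A cache key builder along these lines might look like the sketch below. The function name `build_cache_key` and its parameters are hypothetical, and sorting query parameters is an added normalization assumption (so that reordered parameters share one entry), not something the default key requires.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_cache_key(url: str, ignore_query: bool = False, vary_values: tuple = ()) -> str:
    """Build a cache key from scheme + host + path (+ query) plus optional variants."""
    parts = urlsplit(url)
    key = f"{parts.scheme}://{parts.netloc.lower()}{parts.path}"
    if not ignore_query and parts.query:
        # Assumption: sort parameters so ?a=1&b=2 and ?b=2&a=1 map to one entry.
        params = sorted(parse_qsl(parts.query, keep_blank_values=True))
        key += "?" + urlencode(params)
    for value in vary_values:  # e.g. negotiated encoding, language, device type
        key += "|" + value
    return key


print(build_cache_key("https://cdn.example.com/images/logo.png?v=2"))
print(build_cache_key("https://cdn.example.com/images/logo.png?v=2", ignore_query=True))
```

Every extra dimension in the key (cookies, headers, device type) multiplies the number of stored variants, so each one lowers the hit ratio unless it is normalized to a small set of values.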
Cache Control Headers
Cache-Control: public, max-age=86400, s-maxage=604800
public: CDN can cache
max-age: Browser cache TTL (1 day)
s-maxage: CDN/shared cache TTL (7 days)
Cache-Control: private, no-store
Don't cache at CDN (personalized content)
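How a shared cache could turn the two Cache-Control examples above into a TTL is sketched below. The helper `shared_cache_ttl` is hypothetical, and treating a response with no explicit lifetime as uncacheable is an assumption of this sketch (real caches may apply heuristic freshness).

```python
def shared_cache_ttl(cache_control: str) -> int:
    """Return the TTL in seconds a shared cache (CDN) may use; 0 means do not cache."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')
    if "private" in directives or "no-store" in directives:
        return 0
    # s-maxage takes precedence over max-age for shared caches.
    for name in ("s-maxage", "max-age"):
        if directives.get(name, "").isdigit():
            return int(directives[name])
    return 0  # assumption: no explicit lifetime means "do not cache" here


print(shared_cache_ttl("public, max-age=86400, s-maxage=604800"))  # 604800
print(shared_cache_ttl("private, no-store"))                        # 0
```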
Vary: Accept-Encoding
Cache different versions for gzip vs brotli
Event Bus Design (Kafka)
Topic: cdn-events
Partitions: 64 (scale consumers horizontally)
Partition key: entity_id (user_id / order_id; preserves per-entity ordering)
Retention: 7 days (compliance) or 24h (high-volume telemetry)
Replication factor: 3, min.insync.replicas: 2
Producer: idempotent producer enabled (enable.idempotence=true)
Consumer: consumer group "cdn-processors"
- At-least-once delivery + idempotent handlers (dedup by event_id)
- DLQ topic: cdn-events-dlq (poison messages after 3 retries)
- Lag alert: consumer lag > 60s → scale workers
Async side effects MUST NOT block the synchronous API response.
Sync path: validate → persist source of truth → publish event → return 201
Async path: consumers update caches, indexes, notifications, aggregates
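The consumer-side rules (at-least-once delivery, dedup by event_id, dead-lettering poison messages) can be illustrated without a Kafka client. The sketch below is an in-memory simulation: `EventProcessor` is a hypothetical name, the seen-ID set stands in for a durable store, the list stands in for the cdn-events-dlq topic, and "3 retries" is interpreted here as three failed attempts in total.

```python
MAX_ATTEMPTS = 3  # assumption: three failed attempts send the event to the DLQ


class EventProcessor:
    """Idempotent handler wrapper for an at-least-once event stream."""

    def __init__(self, handler) -> None:
        self.handler = handler
        self.seen_event_ids = set()  # stand-in for a durable dedup store
        self.dead_letters = []       # stand-in for the cdn-events-dlq topic

    def process(self, event: dict) -> str:
        event_id = event["event_id"]
        if event_id in self.seen_event_ids:
            return "duplicate"  # redelivered event: skip the side effects
        for _attempt in range(MAX_ATTEMPTS):
            try:
                self.handler(event)
            except Exception:
                continue  # retry; a real consumer would back off between attempts
            self.seen_event_ids.add(event_id)
            return "processed"
        self.dead_letters.append(event)  # poison message: park it and move on
        return "dead-lettered"
```

Recording the event ID only after the handler succeeds keeps a crashed attempt retryable, at the cost of requiring the handler itself to tolerate a partial earlier run.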
Cache Purge
POST /api/v1/purge
{
  "urls": ["https://cdn.example.com/images/logo.png"],
  "pattern": "https://cdn.example.com/css/*"
}
Response: 202 Accepted
{
  "purge_id": "purge-uuid",
  "status": "propagating",
  "estimated_completion": "30 seconds"
}
Cache Warm
POST /api/v1/warm
{
  "urls": ["https://cdn.example.com/videos/new-release.mp4"],
  "regions": ["us-east", "eu-west", "ap-south"]
}
Get Analytics
GET /api/v1/analytics?domain=cdn.example.com&period=24h
Response: 200 OK
{
  "total_requests": 50000000,
  "cache_hit_ratio": 0.93,
  "bandwidth_gb": 15000,
  "latency_p50_ms": 12,
  "latency_p99_ms": 45,
  "by_region": [...]
}
Common Error Responses
400 Bad Request: invalid input, missing fields, or malformed JSON
401 Unauthorized: missing or invalid auth token or API key
403 Forbidden: authenticated but insufficient permissions
404 Not Found: resource ID does not exist
409 Conflict: duplicate write or version conflict; retry with idempotency key
422 Unprocessable Entity: valid syntax but invalid business logic
429 Too Many Requests: rate limit exceeded; honor Retry-After header
500 Internal Error: unexpected server fault; retry with idempotency key
503 Service Unavailable: dependency down or overloaded; use exponential backoff
Edge Server Cache Entry
Cache Key: "https://cdn.example.com/images/logo.png"
Metadata:
  content_type: "image/png"
  content_length: 45678
  etag: "abc123"
  last_modified: "2026-03-13T00:00:00Z"
  cache_control: "public, max-age=86400"
  expires_at: "2026-03-14T00:00:00Z"
  created_at: "2026-03-13T00:00:00Z"
  hit_count: 1523
  last_accessed: "2026-03-13T10:30:00Z"
Body: [binary content stored in SSD/memory]
DNS Routing Table
| Region | PoP | IP Addresses | Health |
|---|---|---|---|
| US-East | NYC | [203.0.113.1, ...] | healthy |
| US-East | IAD | [203.0.113.5, ...] | healthy |
| EU-West | LDN | [198.51.100.1, ...] | healthy |
| AP-South | MUM | [192.0.2.1, ...] | degraded |
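One way an authoritative DNS server could use the routing table above is sketched here: prefer a healthy PoP in the resolver's region, then fail over by region. The `FALLBACK_REGIONS` ordering and the `pick_pop` function are hypothetical and not part of the table; IP lists are omitted because the table elides them.

```python
from typing import Optional

ROUTING_TABLE = [
    {"region": "US-East", "pop": "NYC", "health": "healthy"},
    {"region": "US-East", "pop": "IAD", "health": "healthy"},
    {"region": "EU-West", "pop": "LDN", "health": "healthy"},
    {"region": "AP-South", "pop": "MUM", "health": "degraded"},
]

# Hypothetical failover order; a real system would derive this from latency data.
FALLBACK_REGIONS = {
    "US-East": ["EU-West"],
    "EU-West": ["US-East"],
    "AP-South": ["EU-West", "US-East"],
}


def pick_pop(resolver_region: str) -> Optional[str]:
    """Return a PoP for the resolver's region, preferring healthy PoPs."""
    for region in [resolver_region] + FALLBACK_REGIONS.get(resolver_region, []):
        for row in ROUTING_TABLE:
            if row["region"] == region and row["health"] == "healthy":
                return row["pop"]
    # Last resort: a degraded PoP in the home region beats no answer at all.
    for row in ROUTING_TABLE:
        if row["region"] == resolver_region:
            return row["pop"]
    return None


print(pick_pop("US-East"))   # NYC
print(pick_pop("AP-South"))  # LDN
```

Because the answer is cached by resolvers for the DNS TTL, a health change in this table only takes effect for a client once its resolver re-queries.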
Purge Propagation
Kafka Topic: cache-purge
{
  "purge_id": "uuid",
  "pattern": "https://cdn.example.com/css/*",
  "initiated_at": "2026-03-13T10:00:00Z",
  "target_pops": ["all"]
}
| Concern | Solution |
|---|---|
| PoP failure | DNS/Anycast routes to next closest PoP. Health checks every 10s |
| Edge server failure | Load balancer within PoP routes to healthy servers; consistent hashing rebalances |
| Origin failure | Serve stale content from cache (stale-while-revalidate, stale-if-error directives) |
| Cache stampede | Request coalescing: only one request to origin; all other waiters served from the same response |
| DDoS at edge | Rate limiting, WAF rules, TCP SYN cookies, challenge pages (CAPTCHA) at edge |
| Cable cut (region offline) | Anycast reroutes globally; regional failover to adjacent PoPs |
Specific: Cache Stampede / Thundering Herd
When a popular cached item expires, hundreds of concurrent requests trigger simultaneous origin fetches:
- Request coalescing: First request triggers origin fetch; subsequent requests wait for the result
- Stale-while-revalidate: Serve stale content while fetching fresh content in background
- Jittered TTL: Add random ±10% jitter to TTL → different PoPs expire at different times
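Request coalescing and jittered TTLs from the list above can be sketched with the standard library. This is a thread-based illustration, not how any particular proxy implements it; `CoalescingFetcher`, `jittered_ttl`, and the `fetch_from_origin` callback are hypothetical names.

```python
import random
import threading


class CoalescingFetcher:
    """Collapses concurrent misses for the same key into one origin fetch."""

    def __init__(self, fetch_from_origin) -> None:
        self._fetch = fetch_from_origin
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> shared state for the fetch in progress

    def get(self, key: str):
        with self._lock:
            entry = self._in_flight.get(key)
            is_leader = entry is None
            if is_leader:
                entry = {"done": threading.Event()}
                self._in_flight[key] = entry
        if is_leader:
            try:
                entry["value"] = self._fetch(key)  # the only origin request
            except Exception as exc:
                entry["error"] = exc
            finally:
                with self._lock:
                    del self._in_flight[key]
                entry["done"].set()  # wake every waiter
        else:
            entry["done"].wait()  # followers reuse the leader's result
        if "error" in entry:
            raise entry["error"]
        return entry["value"]


def jittered_ttl(base_ttl: float, jitter: float = 0.10) -> float:
    """Spread expiry times by a random factor of +/- jitter (10% by default)."""
    return base_ttl * random.uniform(1 - jitter, 1 + jitter)
```

Followers also receive the leader's error, so a failing origin sees one request per key rather than one per waiter.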
Push vs Pull CDN
Pull CDN: Edge fetches from origin on the first request (cache miss). Best for dynamic sites and user-generated content.
Push CDN: Origin proactively pushes content to edge servers. Best for known popular content (video, software updates).
Multi-CDN Strategy
- Use multiple CDN providers (Akamai + CloudFront + Fastly)
- Benefits: Redundancy, cost optimization, best performance per region
- Real-time switching: Traffic management layer routes to the best-performing CDN per user
Edge Computing
- Run application logic at the edge (Cloudflare Workers, AWS Lambda@Edge)
- Use cases: A/B testing, header manipulation, authentication, image resizing, personalization
- Reduces round trips to origin
TLS at Edge
- SSL/TLS terminated at edge server (not origin)
- Certificate management: Automatic certificate provisioning (Let's Encrypt) or customer certificate
- Edge-to-origin connection: Can be HTTP (internal network) or HTTPS (security)
Image Optimization at Edge
- Automatic format conversion (WebP/AVIF for supported browsers)
- Responsive sizing (resize based on Accept header or query param)
- Quality adjustment based on network speed
- Reduces bandwidth by 30-50%
Monitoring
- Real-time dashboards: requests/sec, bandwidth, cache hit ratio, error rate by PoP
- Alert on: cache hit ratio < 80%, origin error rate > 5%, latency p99 > 100ms
- Origin health monitoring: synthetic probes from each region
Interview Walkthrough
- Start with the read-heavy traffic profile and why origin offload is the primary goal — use Interview Patterns for cache hierarchy.
- Explain DNS-based vs Anycast routing to nearest PoP and how DNS TTL affects failover during PoP outages.
- Walk through cache key design (URL + Vary headers + query string policy) and negative caching for 404s.
- Cover cache invalidation strategies: TTL-only, active purge API, and versioned asset URLs for immutable content.
- Discuss origin shield (regional mid-tier cache) to collapse thundering herd on cache miss.
- Common pitfall: caching personalized HTML at the edge — without Vary or edge-side includes, users see each other's data.
Detailed Cache Miss Flow (Edge → Shield → Origin)
Total cold-miss latency: ~106 ms (DNS 0ms + TLS 0ms + Edge L1/L2/L3 miss 6ms + Consistent Hash 0.1ms + Origin Shield 15ms + Origin 80ms + backfill 5ms). Subsequent requests (cache hit): ~5 ms (RAM) or ~6 ms (SSD). Performance impact: roughly a 20x latency difference between hit and miss, which is why cache hit ratio (target > 90%) is the most critical CDN metric.
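The latency figures above translate into an expected per-request latency once a hit ratio is fixed. A minimal sketch, assuming every hit is served from RAM at 5 ms and every miss pays the full cold-miss path; the names are illustrative.

```python
# Cold-miss components in milliseconds, taken from the breakdown above.
COLD_MISS_COMPONENTS_MS = {
    "edge L1/L2/L3 miss": 6,
    "consistent hash": 0.1,
    "origin shield": 15,
    "origin": 80,
    "backfill": 5,
}
HIT_MS = 5.0  # RAM hit
MISS_MS = sum(COLD_MISS_COMPONENTS_MS.values())  # ~106 ms


def expected_latency_ms(hit_ratio: float) -> float:
    """Average latency when a fraction hit_ratio of requests are cache hits."""
    return hit_ratio * HIT_MS + (1 - hit_ratio) * MISS_MS


for ratio in (0.80, 0.90, 0.95):
    print(f"hit ratio {ratio:.0%}: {expected_latency_ms(ratio):.1f} ms")
```

Under these assumptions the average falls from roughly 25 ms at an 80% hit ratio to roughly 10 ms at 95%.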
TLS 1.3 Handshake at Edge (Why Edge Termination Matters)
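Without CDN: connection setup to an origin at 200 ms RTT costs 2 RTT (TCP handshake + TLS 1.3 handshake) = 400 ms before the request can be sent. With CDN: the same setup against an edge at 5 ms RTT costs 10 ms, and a cache hit then needs no origin round trip. With TLS 1.3 0-RTT resumption only the TCP round trip remains: 5 ms. Saving: 400 ms → 5 ms, an 80x faster connection setup. OCSP stapling at the edge avoids the client's separate OCSP check, saving another round trip.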
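The handshake arithmetic can be reproduced in a few lines. The sketch assumes the 2 RTT figure is one TCP round trip plus one TLS 1.3 round trip, and that 0-RTT resumption leaves only the TCP round trip; `setup_ms` is an illustrative helper.

```python
def setup_ms(rtt_ms: float, round_trips: int) -> float:
    """Connection setup time before the first request can be sent."""
    return rtt_ms * round_trips


origin_full = setup_ms(200, 2)  # TCP + TLS 1.3 to a distant origin
edge_full = setup_ms(5, 2)      # same handshakes against a nearby edge
edge_resumed = setup_ms(5, 1)   # assumption: 0-RTT resumption, TCP round trip only

print(origin_full, edge_full, edge_resumed)  # 400 10 5
print(origin_full / edge_resumed)            # 80.0
```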
Consistent Hashing Within a PoP
With 100 edge servers per PoP, without coordination the same URL could be cached on ALL 100 servers = 100x wasted storage. Consistent hashing maps URLs to specific servers via a hash ring with virtual nodes. Each URL is cached on exactly ONE server within the PoP. On server failure, only ~1/N keys need to re-cache.
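A hash ring with virtual nodes, as described above, can be sketched as follows. SHA-256 and 100 virtual nodes per server are illustrative choices, and `HashRing` is a hypothetical name rather than any specific proxy's implementation.

```python
import bisect
import hashlib
from typing import Iterable


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hash ring: each URL is owned by exactly one server in the PoP."""

    def __init__(self, servers: Iterable[str], vnodes: int = 100) -> None:
        # Each server appears at `vnodes` pseudo-random points to even out load.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def server_for(self, url: str) -> str:
        # First ring point clockwise from the URL's hash, wrapping at the end.
        index = bisect.bisect_right(self._points, _hash(url)) % len(self._ring)
        return self._ring[index][1]


servers = [f"edge-{i}" for i in range(10)]
ring = HashRing(servers)
print(ring.server_for("https://cdn.example.com/images/logo.png"))
```

Rebuilding the ring without one server leaves every URL owned by the remaining servers in place; only the failed server's URLs are reassigned.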
Cache Invalidation Propagation Flow
Admin triggers purge → Purge Service validates → Kafka topic cache-purge → each PoP's purge consumer reads → fans out to all edge servers → each edge server deletes matching entries. Total propagation: < 1 second for exact URLs, < 5 seconds for glob patterns. Optimization: prefix tree index on cache keys for O(prefix_len) glob lookup, or tag-based purge for O(1) lookup.
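The last hop of the flow above, where an edge server deletes matching entries, can be sketched as a small index supporting exact, tag-based, and glob purges. `PurgeIndex` is a hypothetical name; the glob purge here is a plain linear scan using `fnmatch`, not the prefix tree mentioned above.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


class PurgeIndex:
    """Per-server cache index supporting exact, tag-based, and glob purges."""

    def __init__(self) -> None:
        self.entries = {}                    # cache key -> body
        self.keys_by_tag = defaultdict(set)  # surrogate tag -> cache keys

    def store(self, key: str, body: bytes, tags: tuple = ()) -> None:
        self.entries[key] = body
        for tag in tags:
            self.keys_by_tag[tag].add(key)

    def purge_url(self, key: str) -> int:
        if key in self.entries:
            del self.entries[key]
            return 1
        return 0

    def purge_tag(self, tag: str) -> int:
        # One dictionary lookup finds the keys, regardless of cache size.
        keys = self.keys_by_tag.pop(tag, set())
        return sum(self.purge_url(key) for key in keys)

    def purge_glob(self, pattern: str) -> int:
        # Linear scan over all keys; a prefix tree would avoid visiting every key.
        matches = [key for key in self.entries if fnmatchcase(key, pattern)]
        return sum(self.purge_url(key) for key in matches)
```

Each method returns the number of entries removed, which a purge consumer could report back to the control plane to track propagation.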
Cache Warming vs On-Demand
On-Demand (Pull, default): the first user request triggers a cache miss. No cache space is wasted, but the first user in each PoP sees a slow response. Pre-Warming (Push): proactively push content to edge. No cold-start latency, but cache space is wasted if the content isn't popular. Recommended: a hybrid of default on-demand pull plus selective warming of known-popular content before launch.
Staff interviews expect you to articulate how the system evolves under real growth — not jump straight to the final architecture.
Phase 1 — Single-tier edge cache
Deploy nginx cache at 20 PoPs. GeoDNS routes to nearest. Origin in single region. TTL-based expiry only. Manual purge via SSH to each PoP (doesn't scale).
Key components: nginx edge cache · GeoDNS · Single origin · TTL-only invalidation
Move to next phase when: Origin bandwidth exceeds 100 Gbps on cache miss storm; manual purge takes 30 min
Phase 2 — Origin shield + purge API
Add regional origin shield layer with consistent hashing. Centralized purge API with pub/sub fanout. Surrogate-key invalidation. Anycast for top 20 PoPs. Cache key normalization rules per tenant.
Key components: Origin shield · Consistent hashing · Purge control plane · Anycast · Surrogate keys
Move to next phase when: 95% hit ratio target missed; compliance requires GDPR delete in < 60 sec
Phase 3 — Global CDN platform
200+ PoPs with L1/L2 hierarchy. Edge compute (Wasm) for dynamic personalization. Multi-origin with health-checked failover. Real-time analytics per tenant (hit ratio, bandwidth, status codes). Soft purge + stale-while-revalidate.
Key components: 3-tier cache hierarchy · Edge compute · Multi-origin failover · Tenant analytics · Soft purge
Move to next phase when: Enterprise customers need sub-10ms dynamic API caching with auth at edge
SLOs & Error Budgets
| Metric | Target | Rationale |
|---|---|---|
| Edge response p99 latency | < 50ms | Cache HIT path — no origin round-trip |
| Global cache hit ratio | > 95% | Origin cost and resilience |
| Purge propagation p99 | < 30 sec | Compliance and content freshness |
| PoP availability | 99.99% | Anycast failover should mask single-PoP failure |
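The availability target in the table implies a concrete error budget. A minimal sketch, assuming a 30-day measurement window (the window length is not stated above); `allowed_downtime_minutes` is an illustrative helper.

```python
def allowed_downtime_minutes(availability: float, window_days: int = 30) -> float:
    """Error budget: minutes of unavailability permitted within the window."""
    return (1 - availability) * window_days * 24 * 60


print(round(allowed_downtime_minutes(0.9999), 2))  # 4.32
print(round(allowed_downtime_minutes(0.999), 2))   # 43.2
```

At 99.99% the budget is only a few minutes per month, which is why single-PoP failures have to be masked by routing rather than fixed by hand.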
Incident Scenarios (2am reality)
| Scenario | How you detect | Mitigation |
|---|---|---|
| Origin overload during coordinated TTL expiry | Origin 5xx rate spikes at top of every hour; shield miss ratio jumps to 40% | Enable stale-while-revalidate; jitter TTL per object (+/- 10%); increase shield singleflight timeout; emergency TTL extension via control plane |
| Bad config push clears all edge caches | Global hit ratio drops to 0%; origin bandwidth instant 100× spike | Config rollback via versioned deployments; canary PoP before global push; origin rate limiting + auto-scale; shield request coalescing |
| Cache poisoning via Host header manipulation | Users report seeing another tenant's content; security scan flags cross-tenant cache key collision | Include tenant_id in cache key namespace; validate Host header against tenant config; emergency flush of affected PoP partition |
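The cache poisoning mitigation in the last row (tenant_id in the cache key, Host validated against tenant config) can be sketched as below. `TENANT_HOSTS` and `tenant_cache_key` are hypothetical, and the sketch assumes the tenant has already been identified by something other than the Host header, for example the TLS server name.

```python
# Hypothetical tenant config: tenant_id -> hostnames the tenant is allowed to serve.
TENANT_HOSTS = {
    "tenant-a": {"cdn.example.com"},
    "tenant-b": {"static.example.org"},
}


def tenant_cache_key(tenant_id: str, host_header: str, path: str) -> str:
    """Validate the Host header, then namespace the cache key by tenant."""
    host = host_header.strip().lower()
    if host not in TENANT_HOSTS.get(tenant_id, set()):
        # Reject instead of caching: a spoofed Host must never produce a key.
        raise ValueError(f"host {host!r} is not configured for {tenant_id}")
    return f"{tenant_id}|{host}{path}"


print(tenant_cache_key("tenant-a", "cdn.example.com", "/images/logo.png"))
```

Because the tenant ID prefixes every key, two tenants can never collide on the same entry even if their paths are identical.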
Cost Drivers (Staff lens)
- Egress bandwidth: 10 Tbps peak × $0.01–0.05/GB depending on region. Sustained at peak, that is roughly $0.4–2.0 billion per year; average traffic below peak lowers the figure proportionally
- PoP infrastructure: 200 PoPs × 50 servers × SSD cache — CapEx + colo fees dominate fixed cost
- Origin offload value: each 1% hit ratio improvement saves ~100 Gbps origin egress
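The egress and offload figures in the list above follow from simple unit conversion. This sketch assumes decimal units (1 Tbps = 1000 Gbps = 125 GB/s) and, as an upper bound, traffic sustained at peak for the whole year; the function names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def annual_egress_cost_usd(avg_tbps: float, usd_per_gb: float) -> float:
    """Yearly egress cost for a sustained average throughput."""
    gb_per_second = avg_tbps * 1000 / 8  # Tbps -> Gbps -> GB/s
    return gb_per_second * SECONDS_PER_YEAR * usd_per_gb


def origin_savings_gbps(peak_tbps: float, hit_ratio_gain: float) -> float:
    """Origin egress avoided when the hit ratio improves by hit_ratio_gain."""
    return peak_tbps * 1000 * hit_ratio_gain


low = annual_egress_cost_usd(10, 0.01)
high = annual_egress_cost_usd(10, 0.05)
print(f"${low / 1e9:.2f}B to ${high / 1e9:.2f}B per year at sustained peak")
print(f"{origin_savings_gbps(10, 0.01):.0f} Gbps saved per 1% hit ratio gain")
```

Real bills depend on average rather than peak throughput and on committed-volume pricing, so these numbers are a ceiling for the build-vs-buy comparison.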
Multi-Region & DR
PoPs are inherently multi-region. Origins may be multi-region with geo-aware shield routing (EU edges → EU origin). Cross-region cache fill via private backbone avoids public internet for inter-PoP transfer. Control plane active-active with CRDT-based config merge for partition tolerance.