This problem appears in multiple sheets. Depth expectations increase as you progress:
Interview Prompt
Design Map Rendering and Navigation System like Google Maps.
Clarifying Questions (ask before designing)
| Question | Why it matters |
|---|---|
| Which of these is highest priority: Tile pyramid (zoom levels), Vector vs raster tiles, Dijkstra/A* with contraction hierarchies? | Forces scope negotiation — senior candidates trim before drawing boxes. |
| What scale should we design for — DAU, QPS, data volume? | Drives every capacity decision; shows structured thinking. |
| What are the read vs write patterns on the critical path? | Determines caching, DB choice, and replication topology. |
| What consistency and durability guarantees are required? | Separates strong-consistency paths from eventual ones — a senior differentiator. |
Scope
In scope
- Tile pyramid (zoom levels)
- Vector vs raster tiles
- Dijkstra/A* with contraction hierarchies
- Offline maps
- Turn-by-turn state machine
- Capacity estimation with shown math
Out of scope (state explicitly)
- Ride dispatch and matching (#19)
- Map tile data collection / OpenStreetMap ingestion
- Offline maps storage on device
Assumptions
- Clarify scale (DAU, QPS, data volume) for map rendering navigation in the first 5 minutes
- Standard reliability target 99.9%–99.99% unless problem implies higher (payments, booking)
- Managed cloud services (RDS, S3, Kafka, Redis) are acceptable building blocks
These foundational concepts underpin the patterns used in this problem. Review them before deep-diving into component-level trade-offs.
- Map rendering: Display interactive, zoomable, pannable maps with satellite, terrain, and traffic layers.
- Geocoding: Convert human-readable addresses to geospatial coordinates and vice versa (reverse geocoding).
- Route planning: Calculate optimal routes between origin and destination across multiple modes (driving, walking, cycling, transit).
- Turn-by-turn navigation: Real-time guided navigation with voice prompts, route progression, and lane guidance.
- Live traffic: Display real-time traffic conditions and dynamically reroute when congestion or incidents are detected.
- Search places: Query points of interest (POIs like restaurants, gas stations, hospitals) by name, category, or distance.
- ETA estimation: Accurately estimate time of arrival considering real-time traffic speeds, road class, and speed limits.
- Offline maps: Download designated map regions to enable basic offline search, rendering, and route computation.
- Street View: Provide 360° panoramic street-level imagery.
- User contributions: Support real-time incident reports (accidents, speed traps, road closures).
- Low Latency: Interactive map tiles must render in < 200 ms. Route planning queries must return in < 1 second.
- High Availability: 99.99% availability (navigation failure while driving is highly disruptive and unsafe).
- Scalability: Support 1B+ MAU, 300M DAU, and 50M+ concurrent active navigation sessions.
- Accuracy: Route paths must strictly follow road rules; ETA estimations must stay within ±10% margin of actual time.
- Freshness: Traffic telemetry must aggregate and refresh every 30–60 seconds.
- Bandwidth Efficiency: Minimize cellular data transfer (mandatory use of compact Vector protobuf tiles instead of raw Raster images).
- Global Coverage: Scale to every country, supporting regional rules, traffic signs, and localized address formats.
| Metric | Calculation | Value |
|---|---|---|
| DAU | 300 Million | |
| Map tile requests / sec | Multiple tiles per pan/zoom | ~2 Million |
| Route calculation requests / sec | 100K | |
| Active navigation sessions | Concurrently active | 50 Million |
| Traffic data points / sec | GPS telemetries sent every 5s | 10 Million |
| Map data total size (raw) | Vector roads, points of interest | ~20 TB |
| Tile cache (all zoom levels) | Pre-rendered + cached vector | ~500 TB |
| Road graph (global) | Nodes and edges representation | ~2B nodes, ~5B edges |
I/O Estimations: 1. Map tile bandwidth: - 2M tile requests/sec × 10 KB avg vector tile size = 20 GB/s egress bandwidth (handled by CDNs). 2. Traffic telemetry data ingest: - 10M GPS updates/sec × 100 bytes/update = 1 GB/s ingress. Kafka handles this with dynamic partitioning.
The architecture divides the system into three main engines: Tile Service (renders visual map partitions),Routing & Navigation Engine (computes shortest/optimal paths), and Traffic Service (collects GPS telemetry and computes edge weights).
1. Map Tile System: Pyramids & Vector Formats ⭐
The world map is represented as a tile pyramid. Zoom Level 0 covers the entire planet in one 256×256 px tile. Each subsequent zoom level quadruples the number of tiles (2z × 2z grid). Zoom 18 street-level contains ~69 billion tiles.
Vector Tiles vs Raster Tiles:
| Aspect | Raster Tiles (Traditional) | Vector Tiles (Modern) ⭐ |
|---|---|---|
| Data Content | Pre-rendered static PNG/JPEG images | Raw geometric coordinates (roads, buildings, labels) |
| Client Effort | Low (displays pre-drawn pixels) | High (WebGL client GPU rendering) |
| Dynamic Rotation | Poor (blurry/pixelated text) | Flawless (vectors re-align instantly) |
| Styling / Themes | Requires full server-side re-render | Instant (client stylesheet swap) |
| Network Size | ~20–50 KB per tile | ~5–15 KB per tile (Google Protobuf MVT) |
| Offline Feasibility | Extremely bulky (hundreds of GBs) | Highly viable (highly compressed geometries) |
2. Routing Engine: Algorithms & Customizations ⭐
The road network is modeled as a weighted directed graph (~2B nodes, ~5B edges). Since a standard Dijkstra search would take minutes for continental paths, optimized variations are deployed:
- Contraction Hierarchies (CH): A pre-processing step that eliminates "unimportant" minor nodes (cul-de-sacs, residential alleys) and inserts "shortcut" edges. Bidirectional Dijkstra searches UP the hierarchy from both origin and destination. Yields sub-millisecond continental searches. However, CH handles real-time traffic updates poorly due to long pre-processing recalculation times.
- Customizable Route Planning (CRP): Divides the global road network graph into nested geometric cells. Pre-computes boundary-to-boundary clique paths within each partition. When live traffic changes edge speeds, only affected cells are re-evaluated, keeping queries exceptionally fast.
- ALT (A*, Landmarks, Triangle Inequality): Uses pre-selected landmark nodes to bound heuristics, offering a flexible middle-ground that easily supports live weight updates.
3. Traffic Service: Real-Time Map Matching (HMM) ⭐
Telemetric GPS coordinates sent by mobile clients are inherently noisy (±10m drift). Snapping these coordinates to the road graph requires the Hidden Markov Model (HMM).
If a driver travels near an elevated freeway overpass and a parallel minor service road below, a simple "closest segment" metric might snap them to the service road. An HMM considers the sequence of past transitions. If the vehicle is registering speeds of 100 km/h, the transition matrix dictates that they must be on the highway. Flink aggregates segment speeds in 60s windows.
4. Navigation Service: Maneuvers & Rerouting
The initial route is computed server-side due to complete access to historical/live traffic data. The client monitors progression locally (allowing offline guidance). If the GPS coordinates deviate > 50m from the planned path, a local routing fallback triggers immediate rerouting, while asynchronously updating the server for optimal traffic adjustments.
Event Bus Design (Kafka)
Topic: map_rendering_navigation-events Partitions: 64 (scale consumers horizontally) Partition key: entity_id (user_id / order_id — preserves per-entity ordering) Retention: 7 days (compliance) or 24h (high-volume telemetry) Replication factor: 3, min.insync.replicas: 2 Producer: idempotent producer enabled (enable.idempotence=true) Consumer: consumer group "map_rendering_navigation-processors" - At-least-once delivery + idempotent handlers (dedup by event_id) - DLQ topic: map_rendering_navigation-events-dlq (poison messages after 3 retries) - Lag alert: consumer lag > 60s → scale workers Design a Map Rendering and Navigation System like Google Maps: async side effects MUST NOT block the synchronous API response. Sync path: validate → persist source of truth → publish event → return 201 Async path: consumers update caches, indexes, notifications, aggregates
Get Vector Map Tiles
GET /api/v1/tiles/{z}/{x}/{y}.mvt
Headers:
Accept: application/vnd.mapbox-vector-tile
Cache-Control: public, max-age=86400
Response:
Binary Protobuf (MVT vector tile payload)Geocoding (Address to Coordinates)
GET /api/v1/geocode?address=1600+Amphitheatre+Parkway+Mountain+View
Response: 200 OK
{
"results": [{
"formatted_address": "1600 Amphitheatre Pkwy, Mountain View, CA 94043",
"geometry": { "lat": 37.4220, "lng": -122.0841 },
"place_id": "ChIJj61dQgK6j4AR4GeTYWZsKWw"
}]
}Get Route Path
POST /api/v1/routes
{
"origin": { "lat": 37.7749, "lng": -122.4194 },
"destination": { "lat": 37.3382, "lng": -121.8863 },
"mode": "driving",
"avoid_tolls": true
}
Response: 200 OK
{
"routes": [{
"route_id": "r-9921",
"distance_meters": 72400,
"duration_seconds": 3420,
"duration_in_traffic_seconds": 4200,
"encoded_polyline": "_p~iF~ps|U_ulLnnqC_mqNvxq`@",
"maneuvers": [
{
"instruction": "Turn left onto Market St",
"distance_meters": 800,
"duration_seconds": 120
}
]
}]
}Common Error Responses
400 Bad Request: invalid input, missing fields, or malformed JSON 401 Unauthorized: missing or invalid auth token or API key 403 Forbidden: authenticated but insufficient permissions 404 Not Found: resource ID does not exist 409 Conflict: duplicate write or version conflict; retry with idempotency key 422 Unprocessable Entity: valid syntax but invalid business logic 429 Too Many Requests: rate limit exceeded; honor Retry-After header 500 Internal Error: unexpected server fault; retry with idempotency key 503 Service Unavailable: dependency down or overloaded; use exponential backoff
PostgreSQL + PostGIS Schema (Road Graph Source of Truth)
-- Road segments model edges
CREATE TABLE road_segments (
segment_id BIGINT PRIMARY KEY,
start_node_id BIGINT NOT NULL,
end_node_id BIGINT NOT NULL,
road_name TEXT,
road_class VARCHAR(30),
length_meters FLOAT,
speed_limit_kmh INT,
one_way BOOLEAN DEFAULT FALSE,
toll BOOLEAN DEFAULT FALSE,
geometry GEOMETRY(LineString, 4326),
country_code CHAR(2)
);
CREATE INDEX idx_road_segments_geom ON road_segments USING GIST(geometry);
-- Nodes represent junctions
CREATE TABLE road_nodes (
node_id BIGINT PRIMARY KEY,
lat DECIMAL(10,7),
lng DECIMAL(10,7),
geometry GEOMETRY(Point, 4326)
);
CREATE INDEX idx_road_nodes_geom ON road_nodes USING GIST(geometry);Redis Transient Caching Structure
# Segment Congestion speeds (stale in 120s)
Key: traffic:{segment_id} -> Hash { avg_speed, congestion_level, updated_at }
# Popular geocoding locations (stale in 24 hours)
Key: geocode:{address_hash} -> JSON { lat, lng, formatted_address }Routing Engine In-Memory Graph Memory Cost
To maintain sub-millisecond route checks, the entire global road graph is hosted in RAM across region shards:
- Nodes: 2 Billion × 16 Bytes (ID, coordinates, level) = 32 GB.
- Edges: 5 Billion × 32 Bytes (from, to, weight, attributes) = 160 GB.
- CH Shortcuts: ~3 Billion × 32 Bytes = 96 GB.
- Total Memory Footprint: ~288 GB (sharded into 20 global region servers of ~50-150 GB each).
| Scenario Concern | System Solution Design |
|---|---|
| Tile Service Ingestion Failure | CDNs absorb > 95% of traffic. Underlying render servers are decoupled via origin shields to prevent thundering herd. |
| Routing Server Crash | Routing microservices are completely stateless. Replicas partition graph data and use health check ALB failovers. |
| Live Traffic Outage | If Flink streaming fails, fall back to historical speeds mapped to the specific weekday and hour of the day. |
| GPS Connection Drop (e.g. Tunnel) | Client-side dead reckoning tracks movement using internal accelerometer and gyroscope heading sensors. |
| Cross-Region Routing Splice | Border gateway nodes are replicated across regional memory shards; routing combines local paths with border matrices. |
Tunnel Outage Recovery Protocol ⭐
When a user enters a prolonged signal dead-zone (e.g., driving through a 5-minute mountain tunnel):
- The mobile client holds the pre-computed maneuvers and route polyline in local memory.
- The GPS receiver tracks coordinate changes without cellular connectivity.
- Guidance calculations run on the client, updating ETA locally and triggering voice warnings.
- Once connection returns, buffered coordinates are sent back to Flink to refine dynamic traffic segments.
1. Polyline Delta Compression ⭐
Sending raw arrays of floats for a long route (e.g. 5,000 lat/lng coordinates from SF to LA) would consume ~200 KB. Google's Encoded Polyline Algorithm uses delta compression (only encoding small variations from the previous point) and base64-like character sets. This reduces the route coordinates footprint to ~15 KB (a 13× reduction), saving massive mobile data.
2. Offline Map Bounding-Box Downloads
When a user downloads an offline city pack (e.g., "San Francisco" region):
- Vector Tiles: Pre-fetched zoom 0–16 MVT blocks for the bounding box (~500 MB).
- Local Graph: Highly simplified local road graph with contraction hierarchies (~140 MB).
- POI Search: Local SQLite-indexed places directory containing categories (~50 MB).
- Total Weight: ~700 MB. Delivers full search and navigation offline, pulling delta updates weekly.
Interview Walkthrough
- Separate tile serving (CDN-cached vector MVT pyramids) from routing (graph-based pathfinding) — they have different latency and storage profiles.
- Serve map tiles as vector MVT bundles at zoom levels 0–18, cached at CDN edge — clients render locally, saving bandwidth vs raster PNGs.
- Preprocess road networks with Contraction Hierarchies offline so A* queries on mobile complete in <50ms on a simplified local graph.
- Compress route polylines with Google's encoded polyline algorithm — 5000 coordinates shrink from ~200 KB to ~15 KB (13× reduction).
- Support offline city packs: pre-fetch zoom 0–16 tiles + simplified road graph + local POI SQLite (~700 MB per city).
- Apply tile cache invalidation by versioning tile sets — push delta updates weekly rather than re-downloading full packs.
- Quantify tile storage: zoom 0–18 global coverage ≈ billions of tiles — only pre-render and cache tiles within active viewport + buffer.
- Common pitfall: serving raster PNG tiles at high zoom levels — storage and bandwidth scale exponentially and mobile rendering becomes sluggish.
1. Map Tile Pre-Rendering vs On-Demand
Pre-rendering all 70B street-level tiles would require ~700 TB of storage and weeks of compute. Any minor map correction (e.g., updating a park layout) would require costly re-rendering. The system chooses a **hybrid** approach: pre-render global zoom levels 0–12 and highly popular metropolitan centers (covering 90% of views). Everything else is rendered on-demand and cached with dynamic TTL invalidation tags.
2. In-Memory Routing vs SQL pgRouting
Using SQL extensions like pgRouting is simple but performs disk-bound calculations, bottlenecking under load. Creating a specialized C++ in-memory routing service (like OSRM or Valhalla) requires massive sharded RAM clusters (~288 GB), but guarantees sub-millisecond continental queries, fully justifying the infrastructure cost.
Staff interviews expect you to articulate how the system evolves under real growth — not jump straight to the final architecture.
Phase 1: MVP (0 to 100K users)
Monolith or minimal services proving core map rendering navigation flows. Optimize for shipping speed and correctness over scale.
Key components: Single region · Primary DB + Redis cache · Synchronous core path · Basic monitoring
Move to next phase when: p99 latency exceeds SLO or DB CPU sustained above 70%
Phase 2: Growth (100K to 10M users)
Split read/write paths, introduce async processing for non-critical work, add caching layers and horizontal scaling.
Key components: Read replicas or CQRS · Message queue for async work · CDN / edge caching · Service-level SLOs
Move to next phase when: Hot keys, fan-out bottlenecks, or ops toil from manual scaling
Phase 3: Scale (10M+ users)
Shard data plane, multi-region active-active or active-passive, formal DR runbooks, cost optimization.
Key components: Database sharding / partitioning · Multi-region replication · Auto-scaling + chaos testing · Dedicated platform/SRE ownership
Move to next phase when: Regional failure domain risk, compliance data residency, or linear cost growth unsustainable
SLOs & Error Budgets
| Metric | Target | Rationale |
|---|---|---|
| Core user-facing availability | 99.95% | Budget for planned maintenance + unplanned failures without user-visible outage. |
| p99 latency (critical path) | Problem-specific — state target early and tie to capacity math | Interview credibility comes from connecting SLO to architecture choices. |
| Error rate (5xx) | < 0.1% | Distinguishes transient blips from systemic failure requiring rollback. |
| Data durability | 99.999999999% (11 nines) for committed writes | Define which operations require fsync/quorum vs async replication. |
Incident Scenarios (2am reality)
| Scenario | How you detect | Mitigation |
|---|---|---|
| Primary database unavailable | Health check failures, connection pool exhaustion alerts, elevated 5xx | Failover to replica / promote standby; enable read-only degraded mode if writes impossible; queue writes if async path exists |
| Traffic spike (10× normal) | RPS anomaly alert, autoscaling lag, latency SLO burn rate | Rate limit non-critical endpoints; scale read path horizontally; pre-warm caches; shed load on expensive operations |
| Bad deploy causing elevated errors | Canary metric regression, error budget burn, deployment correlation | Automated rollback within 5 minutes; feature flag kill switch; maintain N-1 compatibility |
Cost Drivers (Staff lens)
- Egress bandwidth and CDN (often dominates media/data-heavy systems)
- Database storage + IOPS at scale (plan compaction, TTL, tiering)
- Compute for async pipelines (right-size workers, spot instances for batch)
- Managed service premiums vs operational headcount trade-off
Multi-Region & DR
Start single-region with cross-AZ redundancy. Add read replicas in secondary region for DR. Move to active-active only when latency SLO or data residency requires it — accept conflict resolution complexity explicitly.