Interview Prompt
Design a Real-time Vehicle Tracking System.
Clarifying Questions (ask before designing)
| Question | Why it matters |
|---|---|
| What city scale — concurrent trips, drivers, location update rate? | Drives geo-index choice, matching QPS, and streaming ingestion throughput. |
| What scale should we design for — DAU, QPS, data volume? | Drives every capacity decision; shows structured thinking. |
| What are the read vs write patterns on the critical path? | Determines caching, DB choice, and replication topology. |
| What consistency and durability guarantees are required? | Separates strong-consistency paths from eventual ones — a senior differentiator. |
Scope
In scope
- High-frequency location ingestion
- Geospatial pub/sub
- Trajectory compression
- Map matching
- Historical replay
- Capacity estimation with shown math
Out of scope (state explicitly)
- Full payment processing (#24)
- Turn-by-turn map rendering (#54)
- Driver/rider identity verification and background checks
Assumptions
- Single metro / region unless interviewer asks for multi-city
- Mobile clients with intermittent connectivity — server is source of truth
- Managed geo + messaging infra (Kafka, Redis, RDS) is acceptable
These foundational concepts underpin the patterns used in this problem. Review them before deep-diving into component-level trade-offs.
Functional Requirements
- Live tracking: Display real-time location of vehicles on a map (fleet management, delivery tracking)
- Location ingestion: Ingest GPS coordinates from thousands/millions of vehicles at configurable intervals
- Trip tracking: Track active trips with start, waypoints, and end; record full route trail
- Geofence alerts: Trigger alerts when vehicles enter/exit defined zones
- Historical playback: Replay a vehicle's route over any past time period
- Speed/idle alerts: Detect speeding, excessive idling, harsh braking events
- Fleet dashboard: Real-time overview of entire fleet: active, idle, offline vehicles
- ETA for deliveries: Show customer the live position and ETA of their delivery
- Multi-tenant: Support multiple fleet operators on the same platform
Non-Functional Requirements
- Real-time: Location updates visible on dashboard within 3 seconds of device report
- High Throughput: Handle 5M+ location updates/sec across all fleets (2.5× headroom over the 2M/sec estimate for 10M vehicles reporting every 5 seconds)
- Scalability: Support 10M+ simultaneously tracked vehicles
- Durability: No location data point lost (required for compliance, insurance, disputes)
- Low Latency: Map updates in < 3 seconds; dashboard aggregations in < 500 ms
- Availability: 99.99%
- Data Retention: Raw location trails retained for 90 days; aggregated data for 2+ years
- Bandwidth Efficiency: Minimize cellular data usage for vehicle trackers
| Metric | Calculation | Value |
|---|---|---|
| Active vehicles | Given (assumption documented in value) | 10M |
| Avg location update interval | Given (typical workload assumption) | 5 seconds |
| Location updates / sec | 10M vehicles ÷ 5 s update interval | 2M |
| Location point size | Given (assumption documented in value) | 100 bytes |
| Raw data / day | 2M × 100B × 86400 | ~17 TB |
| Concurrent dashboard viewers | Given (peak load assumption) | 500K |
| Persistent connections | Given (assumption documented in value) | 500K WebSocket (dashboard) + 10M MQTT (vehicles) |
| Historical queries / sec | Given (peak load assumption) | 10K |
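The two derived rows in the table can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the inputs stated in the table; the constant names are illustrative.

```python
# Back-of-envelope capacity math from the table's stated inputs.
ACTIVE_VEHICLES = 10_000_000   # given
UPDATE_INTERVAL_S = 5          # given
POINT_SIZE_BYTES = 100         # given
SECONDS_PER_DAY = 86_400

# Every vehicle reports once per interval.
updates_per_sec = ACTIVE_VEHICLES // UPDATE_INTERVAL_S

# Raw ingest volume per day, in decimal terabytes.
raw_bytes_per_day = updates_per_sec * POINT_SIZE_BYTES * SECONDS_PER_DAY
raw_tb_per_day = raw_bytes_per_day / 1e12

print(f"{updates_per_sec:,} updates/sec")
print(f"{raw_tb_per_day:.2f} TB/day raw")
```

Stating the inputs as named constants makes it easy to re-run the numbers if the interviewer changes the fleet size or the update interval.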
Connection Gateway: Handling 10M Persistent Connections
MQTT vs HTTP for vehicle GPS: HTTP carries ~4 KB of overhead per update; at 12 updates/min across 10M vehicles that is ≈ 690 TB/day of overhead alone. MQTT ⭐: a persistent connection with ~20 bytes of header + 100 bytes of payload = 120 bytes per update. For 10M vehicles: ~21 TB/day (~33× less). QoS 1 gives at-least-once delivery. Built-in reconnection handling covers tunnels and rural areas. Last Will and Testament (LWT) detects a vehicle going offline.
MQTT Broker Cluster (EMQX/VerneMQ): Each broker handles ~200K connections. 10M vehicles → 50 brokers. Clustered with shared subscriptions and session persistence.
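The bandwidth comparison and the broker count above follow from simple arithmetic. The sketch below reproduces them from the figures in the text (4 KB HTTP overhead, 120-byte MQTT update, 200K connections per broker); the function and constant names are illustrative.

```python
import math

VEHICLES = 10_000_000
UPDATES_PER_MIN = 12
MINUTES_PER_DAY = 1_440
HTTP_BYTES_PER_UPDATE = 4_000       # ~4 KB of overhead, from the text
MQTT_BYTES_PER_UPDATE = 20 + 100    # ~20 B header + 100 B payload
CONNECTIONS_PER_BROKER = 200_000


def daily_tb(bytes_per_update: int) -> float:
    """Fleet-wide traffic per day in decimal terabytes."""
    updates_per_day = VEHICLES * UPDATES_PER_MIN * MINUTES_PER_DAY
    return updates_per_day * bytes_per_update / 1e12


http_tb = daily_tb(HTTP_BYTES_PER_UPDATE)
mqtt_tb = daily_tb(MQTT_BYTES_PER_UPDATE)
brokers = math.ceil(VEHICLES / CONNECTIONS_PER_BROKER)

print(f"HTTP overhead: {http_tb:.0f} TB/day")
print(f"MQTT total:    {mqtt_tb:.1f} TB/day ({http_tb / mqtt_tb:.0f}x less)")
print(f"MQTT brokers:  {brokers}")
```

The broker count here covers connection capacity only; a real deployment would add headroom for node failures and reconnect storms.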
Location Processor: Updating Latest Position in Redis
Flink streaming job consumes from Kafka "vehicle-locations": validates (lat/lng range, speed < 300 km/h), de-duplicates, enriches with fleet info from device registry, updates Redis (HSET with 5-min TTL), and publishes to Redis Pub/Sub channel for real-time dashboard updates.
Movement Status Detection: speed > 5 km/h → "moving"; speed < 5 km/h for < 5 min → "idle"; speed < 2 km/h for > 5 min → "parked"; no update for > 5 min → "offline" (TTL expiry).
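The status rules above can be written as a small pure function. This is a sketch, not the Flink job itself. The rules do not say what happens to a vehicle that stays between 2 and 5 km/h for more than 5 minutes, or at exactly 5 km/h; the sketch assumes both cases are "idle". The function name and parameters are hypothetical.

```python
MOVING_KMH = 5.0   # above this the vehicle is moving
PARKED_KMH = 2.0   # below this for long enough, the vehicle is parked
WINDOW_S = 300     # 5 minutes, used for both parked and offline detection


def classify(speed_kmh: float, slow_for_s: float, since_last_update_s: float) -> str:
    """Map the latest reading to a movement status.

    slow_for_s: how long the vehicle has been at or below MOVING_KMH.
    since_last_update_s: age of the most recent update.
    """
    if since_last_update_s > WINDOW_S:
        return "offline"   # in the design this falls out of the Redis TTL expiry
    if speed_kmh > MOVING_KMH:
        return "moving"
    if speed_kmh < PARKED_KMH and slow_for_s > WINDOW_S:
        return "parked"
    # Assumption: everything else (including 2-5 km/h for a long time) is idle.
    return "idle"


print(classify(45.3, 0, 2))     # moving
print(classify(0.5, 900, 2))    # parked
print(classify(3.0, 900, 2))    # idle (assumed)
```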
Trail Writer: Persisting Location History
Storage choice: TimescaleDB for hot (90 days): time-based partitioning, 10-20× compression, SQL queries, PostGIS integration, continuous aggregates. ClickHouse for cold analytics (2+ years): daily summaries.
Batch writing: Flink accumulates updates → batch write to TimescaleDB every 5 seconds. 2M/sec × 5s = 10M rows per batch, sharded across 4 nodes = 2.5M rows per shard. TimescaleDB handles 2.5M bulk insert in < 1 second.
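The accumulate-then-flush pattern can be illustrated outside Flink with a plain-Python buffer. This is a sketch under assumptions: the class name, the `sink` callback, and the crc32-based shard choice are all illustrative stand-ins, and a real job would rely on Flink's own windowing and sink connectors.

```python
import zlib
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, int, float, float]  # (vehicle_id, unix_ts, lat, lng)


class ShardedBatcher:
    """Buffers rows per shard and hands each shard one bulk write per interval."""

    def __init__(self, num_shards: int, flush_interval_s: float,
                 sink: Callable[[int, List[Row]], None]) -> None:
        self.num_shards = num_shards
        self.flush_interval_s = flush_interval_s
        self.sink = sink  # hypothetical bulk-insert callback, one call per shard
        self.buffers: Dict[int, List[Row]] = defaultdict(list)
        self.last_flush = 0.0

    def shard_for(self, vehicle_id: str) -> int:
        # crc32 is stable across processes, unlike the built-in hash().
        return zlib.crc32(vehicle_id.encode()) % self.num_shards

    def add(self, row: Row, now: float) -> None:
        self.buffers[self.shard_for(row[0])].append(row)
        if now - self.last_flush >= self.flush_interval_s:
            self.flush(now)

    def flush(self, now: float) -> None:
        for shard, rows in self.buffers.items():
            if rows:
                self.sink(shard, rows)
        self.buffers = defaultdict(list)
        self.last_flush = now
```

With 4 shards and a 5-second interval, each `sink` call corresponds to one of the ~2.5M-row bulk inserts described above.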
Dashboard: Real-Time Fleet View
Fleet manager connects via WebSocket → server subscribes to Redis Pub/Sub for the fleet. Initial load: read the fleet's vehicle IDs from fleet_vehicles:{fleet_id} (SSCAN for large fleets) and fetch each vehicle:{vehicle_id} hash, rather than SCANning the whole keyspace. Ongoing: Redis Pub/Sub pushes updates → server forwards to client. Server-side viewport filtering sends only vehicles in the current map bounds. Client-side clustering at low zoom levels (Supercluster library).
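Server-side viewport filtering amounts to a bounding-box test applied before each update is forwarded. The sketch below is one way to express it; the `Viewport` type and its field names are hypothetical, and it assumes the client sends its map bounds as south/west/north/east in degrees.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Viewport:
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lng: float) -> bool:
        if not (self.south <= lat <= self.north):
            return False
        if self.west <= self.east:
            return self.west <= lng <= self.east
        # The box crosses the antimeridian (e.g. west=170, east=-170).
        return lng >= self.west or lng <= self.east


def visible(updates: Iterable[Dict], viewport: Viewport) -> List[Dict]:
    """Keep only the updates that fall inside the client's current map bounds."""
    return [u for u in updates if viewport.contains(u["lat"], u["lng"])]


san_francisco = Viewport(south=37.70, west=-122.52, north=37.83, east=-122.35)
updates = [
    {"vehicle_id": "v-1", "lat": 37.7749, "lng": -122.4194},
    {"vehicle_id": "v-2", "lat": 40.7128, "lng": -74.0060},
]
print([u["vehicle_id"] for u in visible(updates, san_francisco)])  # ['v-1']
```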
Event Bus Design (Kafka)
Topic: realtime_vehicle_tracking-events
- Partitions: 64 (scale consumers horizontally)
- Partition key: entity_id (vehicle_id in this system; preserves per-vehicle ordering)
- Retention: 7 days (compliance) or 24h (high-volume telemetry)
- Replication factor: 3, min.insync.replicas: 2
- Producer: idempotent producer enabled (enable.idempotence=true)
- Consumer: consumer group "realtime_vehicle_tracking-processors"
- At-least-once delivery + idempotent handlers (dedup by event_id)
- DLQ topic: realtime_vehicle_tracking-events-dlq (poison messages after 3 retries)
- Lag alert: consumer lag > 60s → scale workers
Async side effects MUST NOT block the synchronous API response.
- Sync path: validate → persist source of truth → publish event → return 201
- Async path: consumers update caches, indexes, notifications, aggregates
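At-least-once delivery only works if handlers tolerate redelivery. The sketch below shows the dedup-by-event_id and dead-letter behaviour described above using in-memory stand-ins. It does not use a Kafka client; the class name and the in-memory `seen` set and `dlq` list are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set

MAX_RETRIES = 3  # from the text: dead-letter poison messages after 3 retries


class IdempotentConsumer:
    """Applies each event at most once, retrying failures before dead-lettering."""

    def __init__(self, handler: Callable[[Dict], None]) -> None:
        self.handler = handler
        self.seen: Set[str] = set()   # stand-in for a durable dedup store with a TTL
        self.dlq: List[Dict] = []     # stand-in for the DLQ topic

    def process(self, event: Dict) -> str:
        event_id = event["event_id"]
        if event_id in self.seen:
            return "duplicate"        # redelivered message, safe to skip
        for _attempt in range(1 + MAX_RETRIES):  # first try plus 3 retries
            try:
                self.handler(event)
            except Exception:
                continue              # a real consumer would back off here
            self.seen.add(event_id)
            return "processed"
        self.dlq.append(event)
        return "dead-lettered"
```

Recording the event_id only after the handler succeeds keeps the semantics at-least-once: a crash between the two steps leads to a redelivery rather than a lost event, which is why the handler's own writes should also be idempotent.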
Send Location Update (MQTT)
MQTT Topic: vehicles/{vehicle_id}/location QoS: 1
Payload (protobuf on the wire; shown here as JSON for readability):
{
"vehicle_id": "v-uuid",
"lat": 37.7749, "lng": -122.4194,
"speed": 45.3, "heading": 180,
"timestamp": 1710400000,
"fuel_level": 0.65,
"engine_status": "on",
"events": ["harsh_brake"]
}
Get Vehicle Current Location
GET /api/v1/vehicles/{vehicle_id}/location
Response: 200 OK
{
"vehicle_id": "v-uuid",
"lat": 37.7749, "lng": -122.4194,
"speed": 45.3, "heading": 180,
"status": "moving",
"last_updated": "2025-03-14T10:23:45Z",
"driver": {"name": "John", "phone": "+1..."}
}
Get Vehicle Route History
GET /api/v1/vehicles/{vehicle_id}/trail?start=2025-03-14T08:00:00Z&end=2025-03-14T18:00:00Z&simplify=true
Fleet Overview
GET /api/v1/fleets/{fleet_id}/overview
Response: 200 OK
{
"fleet_id": "f-uuid",
"total_vehicles": 5000,
"moving": 3200,
"idle": 800,
"parked": 700,
"offline": 300,
"alerts_active": 12
}
Common Error Responses
- 400 Bad Request: invalid input, missing fields, or malformed JSON
- 401 Unauthorized: missing or invalid auth token or API key
- 403 Forbidden: authenticated but insufficient permissions
- 404 Not Found: resource ID does not exist
- 409 Conflict: duplicate write or version conflict; retry with idempotency key
- 422 Unprocessable Entity: valid syntax but invalid business logic
- 429 Too Many Requests: rate limit exceeded; honor Retry-After header
- 500 Internal Error: unexpected server fault; retry with idempotency key
- 503 Service Unavailable: dependency down or overloaded; use exponential backoff
- 440 Login Timeout: WebSocket session expired; reconnect required
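The retry guidance in the list above (honor Retry-After, back off exponentially on overload) could be sketched on the client side as follows. The function name, the base and cap values, and the use of full jitter are assumptions made for illustration, not part of the API contract.

```python
import random
from typing import Callable, Optional

RETRYABLE = {429, 500, 503}  # statuses the list above marks as retryable


def backoff_delay(attempt: int, status: int,
                  retry_after_s: Optional[float] = None,
                  base_s: float = 0.5, cap_s: float = 30.0,
                  rng: Callable[[], float] = random.random) -> Optional[float]:
    """Seconds to wait before retry number `attempt` (0-based), or None to give up."""
    if status not in RETRYABLE:
        return None                  # 4xx client errors will not succeed on retry
    if retry_after_s is not None:
        return retry_after_s         # the server told us how long to wait
    # Exponential backoff with full jitter, capped at cap_s.
    return rng() * min(cap_s, base_s * 2 ** attempt)


for attempt in range(4):
    print(attempt, backoff_delay(attempt, 503))
```

Jitter spreads out retries so that millions of clients recovering from the same outage do not all reconnect at the same instant.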
Redis: Latest Vehicle State
vehicle:{vehicle_id} → Hash { lat, lng, speed, heading, status, fleet_id, driver_id, fuel_level, last_updated }
TTL: 300 (offline detection)
fleet_vehicles:{fleet_id} → SET of vehicle_ids
fleet_stats:{fleet_id}:moving → INT (atomic counter)
alerts:{vehicle_id} → LIST of active alert JSONs
TimescaleDB: Location Trail (90 Days)
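-- One row per GPS fix reported by a vehicle.
CREATE TABLE location_points (
  vehicle_id UUID NOT NULL,
  timestamp TIMESTAMPTZ NOT NULL,
  lat DOUBLE PRECISION,
  lng DOUBLE PRECISION,
  speed REAL,
  heading SMALLINT,
  altitude REAL,
  status TEXT,
  odometer REAL,
  fuel_level REAL,
  events JSONB,
  -- Unique key so INSERT ... ON CONFLICT DO NOTHING can drop replayed duplicates.
  PRIMARY KEY (vehicle_id, timestamp)
);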
SELECT create_hypertable('location_points', 'timestamp',
chunk_time_interval => INTERVAL '1 day',
partitioning_column => 'vehicle_id',
number_partitions => 4);
ALTER TABLE location_points SET (
timescaledb.compress,
timescaledb.compress_segmentby = 'vehicle_id',
timescaledb.compress_orderby = 'timestamp DESC'
);
Kafka Topics
Topic: vehicle-locations (512 partitions, 48h retention)
Key: vehicle_id Value: protobuf { lat, lng, speed, heading, timestamp, events[] }
Topic: vehicle-alerts (64 partitions)
Topic: vehicle-status-changes (64 partitions)
| Concern | Solution |
|---|---|
| MQTT broker failure | Cluster with session handoff; QoS 1 redelivers unacknowledged messages (at-least-once) |
| Kafka lag | Consumer group rebalancing; auto-scale Flink parallelism |
| Redis failure | Redis Cluster (6+ nodes); AOF persistence |
| TimescaleDB write failure | Kafka retains data (48h); replay after DB recovery |
| Vehicle goes offline | TTL-based detection; alert fleet manager; last known position preserved |
| GPS drift/spoofing | Validate speed vs distance between points; reject impossible |
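The drift/spoofing check in the last row compares the distance between consecutive points with the time between them. Below is a minimal sketch using the haversine formula and the 300 km/h ceiling from the validation step described earlier; the function names and the tuple layout are illustrative.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 300.0  # same ceiling as the ingest validation step

Fix = Tuple[float, float, int]  # (lat, lng, unix_ts)


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_plausible(prev: Fix, curr: Fix, max_speed_kmh: float = MAX_SPEED_KMH) -> bool:
    """True if moving from prev to curr implies a speed at or below the ceiling.

    Only meaningful for consecutive, time-ordered points; returns False when
    the timestamps do not increase, leaving out-of-order handling to the caller.
    """
    dt_s = curr[2] - prev[2]
    if dt_s <= 0:
        return False
    distance_km = haversine_km(prev[0], prev[1], curr[0], curr[1])
    return distance_km / (dt_s / 3600) <= max_speed_kmh


sf = (37.7749, -122.4194, 1710400000)
nearby = (37.7750, -122.4194, 1710400005)    # about 11 m away, 5 s later
new_york = (40.7128, -74.0060, 1710400005)   # thousands of km away, 5 s later
print(is_plausible(sf, nearby), is_plausible(sf, new_york))  # True False
```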
Handling Vehicle GPS Blackout (Tunnel)
During signal loss, the vehicle buffers timestamped records and notes the gap. When GPS recovers, it sends the buffered batch with a gap indicator. The server detects a gap of 5+ minutes, marks status = "GPS_LOST" (not offline, since the MQTT connection may still be alive), and shows the vehicle icon with a "?". Once GPS resumes, the server interpolates the gap (straight-line or map-matched). Edge case: an OBD-II tracker can fall back to cell-tower triangulation (~200 m accuracy).
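The straight-line option for filling a gap can be sketched as linear interpolation between the last fix before the blackout and the first fix after it. The function names and the 5-second step are assumptions; interpolating raw lat/lng linearly is only a reasonable approximation over short distances and ignores the antimeridian.

```python
from typing import List, Tuple

Fix = Tuple[float, float, int]  # (lat, lng, unix_ts)

GAP_THRESHOLD_S = 300  # the 5+ minute gap that triggers GPS_LOST


def is_gps_gap(before: Fix, after: Fix, threshold_s: int = GAP_THRESHOLD_S) -> bool:
    return after[2] - before[2] >= threshold_s


def interpolate_gap(before: Fix, after: Fix, step_s: int = 5) -> List[Fix]:
    """Synthetic points strictly between two fixes, spaced step_s apart."""
    lat0, lng0, t0 = before
    lat1, lng1, t1 = after
    if t1 <= t0:
        raise ValueError("'after' must be later than 'before'")
    filled: List[Fix] = []
    t = t0 + step_s
    while t < t1:
        frac = (t - t0) / (t1 - t0)   # fraction of the gap elapsed at time t
        filled.append((lat0 + frac * (lat1 - lat0), lng0 + frac * (lng1 - lng0), t))
        t += step_s
    return filled


print(interpolate_gap((0.0, 0.0, 0), (1.0, 2.0, 20)))
# [(0.25, 0.5, 5), (0.5, 1.0, 10), (0.75, 1.5, 15)]
```

Interpolated points should be flagged as synthetic when stored, so that compliance and dispute queries can tell them apart from real GPS fixes.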
Out-of-Order GPS Points
Redis: use a Lua script to update only if the incoming timestamp > stored timestamp. TimescaleDB: insert all points regardless of arrival order, use ORDER BY timestamp for chronological queries, and use ON CONFLICT DO NOTHING for idempotency (this needs a unique index, e.g. on (vehicle_id, timestamp), to have any effect). Flink: use event-time processing with watermarks for speed alerts.
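The compare-then-write rule for the latest position can be modelled in a few lines. The sketch below uses an in-memory dict as a stand-in for Redis; in the real system the same check would run inside a Lua script so that the read and the write are atomic. The class and method names are hypothetical.

```python
from typing import Dict, Optional


class LatestPositionStore:
    """In-memory model of 'update only if the incoming timestamp is newer'."""

    def __init__(self) -> None:
        self._state: Dict[str, Dict] = {}

    def update_if_newer(self, vehicle_id: str, point: Dict) -> bool:
        current = self._state.get(vehicle_id)
        if current is not None and point["timestamp"] <= current["timestamp"]:
            return False          # stale or duplicate point: keep the stored one
        self._state[vehicle_id] = dict(point)
        return True

    def get(self, vehicle_id: str) -> Optional[Dict]:
        return self._state.get(vehicle_id)


store = LatestPositionStore()
print(store.update_if_newer("v-1", {"timestamp": 100, "lat": 37.77, "lng": -122.41}))  # True
print(store.update_if_newer("v-1", {"timestamp": 95, "lat": 37.70, "lng": -122.40}))   # False
```

The rejected point is still written to the trail store; only the "latest position" view ignores it.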
Interview Walkthrough
- Frame the problem as a write-heavy telemetry pipeline: at 2M location updates/sec, downstream processing must never block the vehicle's MQTT publish ACK.
- Walk through ingestion: vehicles → MQTT brokers → Kafka (partitioned by vehicle_id) → Flink for stream enrichment and geofence alerts.
- Explain the hot path: Redis stores latest position per vehicle with a Lua script that rejects out-of-order timestamps.
- Cover the read path: dashboard clients subscribe via WebSocket to a fan-out service that reads Redis, not TimescaleDB.
- Describe tiered historical storage — raw points for 24h, TimescaleDB compression for 90 days, ClickHouse aggregates for cold analytics.
- Mention GPS blackout handling: buffer-and-replay on reconnect, interpolate gaps, distinguish GPS_LOST from true offline.
- Common pitfall: writing every GPS point synchronously to PostgreSQL on the ingestion path — the database becomes the bottleneck at 2M writes/sec.
MQTT vs WebSocket vs gRPC for Vehicle Communication
| Protocol | Overhead | Best For |
|---|---|---|
| MQTT ⭐ | 2-5 byte fixed header per message | IoT devices with constrained bandwidth |
| WebSocket | 2-14 bytes per frame | Browser-based dashboards |
| gRPC | HTTP/2 + protobuf | Service-to-service, mobile apps |
Decision: Vehicles → Server: MQTT. Dashboard → Server: WebSocket. Internal services: gRPC.
Location Storage: Raw Points vs Compressed Trails
Raw points: 17 TB/day.
Douglas-Peucker simplification: 50% reduction (highway: 70%, city: 30%).
Delta encoding: 10-15× compression on top of TimescaleDB.
Hybrid ⭐: Real-time (last 24h): raw points. Recent (1-90 days): TimescaleDB compression (10×). Cold (90+ days): ClickHouse aggregated segments (100×). Total for the 90-day window: ~170 TB (17 TB raw + 89 days × 17 TB ÷ 10 ≈ 168 TB) vs ~1.5 PB uncompressed (90 × 17 TB).
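For reference, the Douglas-Peucker simplification mentioned above drops points that lie within a tolerance of the line joining their neighbours, which is why straight highway segments compress better than city routes. The sketch below is a plain recursive version that treats coordinates as planar; a production version would project lat/lng to metres first and avoid deep recursion on very long trails. The function names and the example tolerance are illustrative.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # planar (x, y); project lat/lng before real use


def _perpendicular_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dx == 0 and dy == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dy * (p[0] - a[0]) - dx * (p[1] - a[1])) / math.hypot(dx, dy)


def douglas_peucker(points: List[Point], epsilon: float) -> List[Point]:
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        dist = _perpendicular_distance(points[i], first, last)
        if dist > max_dist:
            index, max_dist = i, dist
    if max_dist <= epsilon:
        return [first, last]      # everything in between is within tolerance
    left = douglas_peucker(points[:index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right      # the split point appears in both halves


trail = [(0.0, 0.0), (1.0, 0.01), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
print(douglas_peucker(trail, epsilon=0.1))
# [(0.0, 0.0), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
```

The near-collinear point (1.0, 0.01) is removed while the sharp turn at (3.0, 2.0) is kept, which is the behaviour the `simplify=true` trail query would rely on.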
Scalability: 10M Vehicles at 5-Second Intervals
2M updates/sec: 50 MQTT broker nodes, 18 Kafka brokers (RF=3, 512 partitions), 25 Flink TaskManagers, 20 Redis shards (5 GB memory), 4 TimescaleDB shards, 10 dashboard WebSocket servers.
Staff interviews expect you to articulate how the system evolves under real growth — not jump straight to the final architecture.
Phase 1: MVP (0 to 100K users)
Monolith or minimal services proving core realtime vehicle tracking flows. Optimize for shipping speed and correctness over scale.
Key components: Single region · Primary DB + Redis cache · Synchronous core path · Basic monitoring
Move to next phase when: p99 latency exceeds SLO or DB CPU sustained above 70%
Phase 2: Growth (100K to 10M users)
Split read/write paths, introduce async processing for non-critical work, add caching layers and horizontal scaling.
Key components: Read replicas or CQRS · Message queue for async work · CDN / edge caching · Service-level SLOs
Move to next phase when: Hot keys, fan-out bottlenecks, or ops toil from manual scaling
Phase 3: Scale (10M+ users)
Shard data plane, multi-region active-active or active-passive, formal DR runbooks, cost optimization.
Key components: Database sharding / partitioning · Multi-region replication · Auto-scaling + chaos testing · Dedicated platform/SRE ownership
Move to next phase when: Regional failure domain risk, compliance data residency, or linear cost growth unsustainable
SLOs & Error Budgets
| Metric | Target | Rationale |
|---|---|---|
| Core user-facing availability | 99.99% | Matches the availability requirement; budget for planned maintenance + unplanned failures without user-visible outage. |
| p99 latency (critical path) | Map update < 3 s from device report; dashboard aggregations < 500 ms | Taken from the latency requirements; tie each target to the capacity math and architecture choices. |
| Error rate (5xx) | < 0.1% | Distinguishes transient blips from systemic failure requiring rollback. |
| Data durability | 99.999999999% (11 nines) for committed writes | Define which operations require fsync/quorum vs async replication. |
Incident Scenarios (2am reality)
| Scenario | How you detect | Mitigation |
|---|---|---|
| Primary database unavailable | Health check failures, connection pool exhaustion alerts, elevated 5xx | Failover to replica / promote standby; enable read-only degraded mode if writes impossible; queue writes if async path exists |
| Traffic spike (10× normal) | RPS anomaly alert, autoscaling lag, latency SLO burn rate | Rate limit non-critical endpoints; scale read path horizontally; pre-warm caches; shed load on expensive operations |
| Bad deploy causing elevated errors | Canary metric regression, error budget burn, deployment correlation | Automated rollback within 5 minutes; feature flag kill switch; maintain N-1 compatibility |
Cost Drivers (Staff lens)
- Egress bandwidth and CDN (often dominates media/data-heavy systems)
- Database storage + IOPS at scale (plan compaction, TTL, tiering)
- Compute for async pipelines (right-size workers, spot instances for batch)
- Managed service premiums vs operational headcount trade-off
Multi-Region & DR
Start single-region with cross-AZ redundancy. Add read replicas in secondary region for DR. Move to active-active only when latency SLO or data residency requires it — accept conflict resolution complexity explicitly.