System Design Problem

Design a Distributed Metrics Aggregation System

Commonly Asked By: Datadog, Uber, Netflix, Google

  • Ingest metrics: Receive time-series metrics from thousands of hosts
  • Aggregate: Sum, avg, p50/p95/p99, min, max across dimensions
  • Downsample: Auto-downsample old data (1s → 1m → 1h → 1d)
  • Query: "Average CPU across region=us-east for last 6 hours at 1-minute granularity"
  • Alerting integration: Feed metrics to alerting rules
  • Dashboarding: Low-latency queries for Grafana-style dashboards
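
To make the Aggregate, Downsample, and Query requirements above concrete, here is a minimal in-memory sketch, not a production design. It groups raw points into fixed-width time buckets and summarises each bucket, optionally filtered by one dimension. The names `Point`, `percentile`, and `downsample` are hypothetical, the nearest-rank percentile method is an assumption (the problem statement does not specify one), and a real system would shard this work across nodes and persist the results.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    ts: int        # Unix timestamp in seconds
    host: str
    region: str    # one example dimension; real metrics carry arbitrary tags
    value: float   # e.g. CPU utilisation in percent


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def downsample(points, bucket_seconds, region=None):
    """Group points into fixed-width time buckets and summarise each bucket.

    Returns a dict mapping bucket start time to a summary dict.
    """
    buckets = defaultdict(list)
    for pt in points:
        if region is not None and pt.region != region:
            continue
        # Align the timestamp down to the start of its bucket.
        bucket_start = pt.ts - pt.ts % bucket_seconds
        buckets[bucket_start].append(pt.value)

    rollup = {}
    for start in sorted(buckets):
        values = sorted(buckets[start])
        total = sum(values)
        rollup[start] = {
            "count": len(values),
            "sum": total,
            "avg": total / len(values),
            "min": values[0],
            "max": values[-1],
            "p50": percentile(values, 50),
            "p95": percentile(values, 95),
            "p99": percentile(values, 99),
        }
    return rollup


# Example: 1-minute rollup of CPU for region=us-east.
points = [
    Point(0, "h1", "us-east", 10.0),
    Point(30, "h2", "us-east", 30.0),
    Point(59, "h1", "us-east", 20.0),
    Point(60, "h1", "us-east", 50.0),
    Point(61, "h3", "eu-west", 90.0),
]
rollup = downsample(points, bucket_seconds=60, region="us-east")
```

Storing `sum` and `count` alongside `avg` lets a coarser tier (1m → 1h → 1d) recompute exact averages from the finer tier, and `min` and `max` also combine exactly. Percentiles do not: a p99 of per-minute p99 values is not the hourly p99, so a design that needs percentiles at coarse granularity typically keeps a mergeable summary such as a histogram or quantile sketch per bucket.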