This problem appears in multiple sheets. Depth expectations increase as you progress:
Interview Prompt
Design Trending Topics System.
Clarifying Questions (ask before designing)
| Question | Why it matters |
|---|---|
| Which of these is highest priority: Sliding window counting, Exponential decay, Count-Min Sketch? | Forces scope negotiation — senior candidates trim before drawing boxes. |
| What scale should we design for — DAU, QPS, data volume? | Drives every capacity decision; shows structured thinking. |
| What are the read vs write patterns on the critical path? | Determines caching, DB choice, and replication topology. |
| What consistency and durability guarantees are required? | Separates strong-consistency paths from eventual ones — a senior differentiator. |
Scope
In scope
- Sliding window counting
- Exponential decay
- Count-Min Sketch
- Heavy hitters algorithm
- Spam/bot filtering
- Capacity estimation with shown math
Out of scope (state explicitly)
- Detailed frontend/UI pixel implementation
- Org structure, staffing, and hiring plan
Assumptions
- Clarify scale (DAU, QPS, data volume) for trending topics in the first 5 minutes
- Standard reliability target 99.9%–99.99% unless problem implies higher (payments, booking)
- Managed cloud services (RDS, S3, Kafka, Redis) are acceptable building blocks
These foundational concepts underpin the patterns used in this problem. Review them before deep-diving into component-level trade-offs.
- Detect trends: Identify topics/hashtags/keywords with abnormal surge in volume
- Real-time: Trends update every 1-5 minutes
- Ranking: Show Top 10/20/50 trending topics, ranked by velocity of growth
- Personalization: Regional/country-specific trends, optionally personalized
- Context: Show why it's trending ("50K tweets about #WorldCup")
- Trend lifecycle: Detect start, peak, and decay of trends
- Filter abuse: Prevent spam/manipulation from gaming trends
- Low Latency: Trend detection within 1-5 minutes of surge start
- High Throughput: Process 500K+ events/sec (tweets, posts, searches)
- Scalability: Handle billions of events/day across 100+ countries
- Accuracy: Distinguish genuine trends from spam/bot campaigns
- Availability: 99.99%
| Metric | Calculation | Value |
|---|---|---|
| Events / sec (tweets + searches + posts) | Derived from daily volume ÷ 86400 (+ peak factor) | 500K |
| Unique topics / hour | Given | 10M |
| Trending topics displayed | Given | Top 20 per region |
| Regions | Given | 200+ countries |
| Topic velocity window | Given | 5-minute sliding window |
| Storage for trend history | Given | 10 GB/day |
Velocity Over Volume: The Core Algorithm
Why velocity, not raw count?
"Taylor Swift" always has 500K mentions/hour → NOT trending (normal)
"#ObscureEvent" normally has 10/hour, today has 5000 → TRENDING!
Formula:
velocity = (V_current - V_baseline) / max(V_baseline, min_floor=10)
Example:
Topic A: V_current=200K, V_baseline=100K → velocity = 1.0 (2× normal)
Topic B: V_current=5000, V_baseline=10 → velocity = 499 (500× normal!)
Topic B ranks higher despite lower absolute count.
This surfaces genuinely surprising/newsworthy events.
Decay: After spike, V_baseline catches up → velocity drops naturally
Lifecycle: Emerging → Peak → Decaying → NormalSpam & Manipulation Detection
Signals computed per topic in the Flink pipeline: 1. Account age: >30% from accounts <7 days old → score ×0.2 2. Account diversity: >50% from <100 unique accounts → suppress 3. Content similarity (SimHash): >60% near-duplicate text → suppress 4. Velocity shape: organic = gradual ramp; bots = instant step function 5. Geographic anomaly: claimed region doesn't match account timezones
Baseline Management
Seasonality model: day-of-week × hour-of-day
Redis key: baseline:{topic}:{day_of_week}:{hour}
Updated weekly by Spark batch job:
avg(volume at this day+hour over past 4 weeks)
Example: "#MondayMotivation"
Monday 8am baseline: 50K (always spikes)
Tuesday 8am baseline: 2K
→ Monday 55K → velocity=0.1 → NOT trending (normal Monday)
→ Tuesday 20K → velocity=9.0 → TRENDING (unusual for Tuesday!)
Cold-start (new topic, no history): min_floor=10 as baselineEvent Bus Design (Kafka)
Topic: trending_topics-events Partitions: 64 (scale consumers horizontally) Partition key: entity_id (user_id / order_id — preserves per-entity ordering) Retention: 7 days (compliance) or 24h (high-volume telemetry) Replication factor: 3, min.insync.replicas: 2 Producer: idempotent producer enabled (enable.idempotence=true) Consumer: consumer group "trending_topics-processors" - At-least-once delivery + idempotent handlers (dedup by event_id) - DLQ topic: trending_topics-events-dlq (poison messages after 3 retries) - Lag alert: consumer lag > 60s → scale workers Design a Trending Topics System: async side effects MUST NOT block the synchronous API response. Sync path: validate → persist source of truth → publish event → return 201 Async path: consumers update caches, indexes, notifications, aggregates
Get Trending Topics
GET /api/v1/trends?region=US&count=20
→ 200 OK
{
"trends": [
{ "rank": 1, "topic": "#WorldCup2026", "volume": "2.3M",
"velocity": 15.2, "category": "Sports",
"started_at": "2026-03-14T08:00:00Z",
"context": "Quarter-finals underway" },
...
],
"as_of": "2026-03-14T10:05:00Z"
}Common Error Responses
400 Bad Request: invalid input, missing fields, or malformed JSON 401 Unauthorized: missing or invalid auth token or API key 403 Forbidden: authenticated but insufficient permissions 404 Not Found: resource ID does not exist 409 Conflict: duplicate write or version conflict; retry with idempotency key 422 Unprocessable Entity: valid syntax but invalid business logic 429 Too Many Requests: rate limit exceeded; honor Retry-After header 500 Internal Error: unexpected server fault; retry with idempotency key 503 Service Unavailable: dependency down or overloaded; use exponential backoff 504 Gateway Timeout: index shard slow; narrow query or retry
Redis
trending:{region} → Sorted Set (score=velocity, member=topic)
topic_ctx:{topic} → Hash {volume, started_at, category, samples}
baseline:{topic}:{day_of_week}:{hour} → Integer (rolling avg)Flink State
Per (topic, region): current_window_count: int ring_buffer[12]: int[] (last 12 five-min windows) last_emit_ts: long Checkpoint: S3, every 60s
ClickHouse
CREATE TABLE trend_snapshots (
region LowCardinality(String),
topic String,
velocity Float64,
volume UInt64,
rank UInt16,
snapshot_at DateTime
) ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(snapshot_at)
ORDER BY (region, snapshot_at, rank);| Concern | Solution |
|---|---|
| Flink failure | Checkpoint to S3 every 60s; resume from last checkpoint |
| Redis loss | Flink recomputes full top-K within 60 seconds |
| Kafka lag | Trends delayed proportionally; alert if lag > 2 min |
| Spam spike | Inline bot filter penalizes before scoring |
| Topic explosion | Only track topics with >100 mentions/hour; prune below |
Race Conditions
Window Boundary Splitting
Tumbling window: [10:00, 10:05), [10:05, 10:10) Spike at 10:04→10:06 → each window sees only HALF Fix: Sliding window (5 min wide, 1 min slide) ⭐ At T=10:06: window [10:01, 10:06] captures both halves Cost: 5× more state (5 overlapping windows). Worth it.
Hot Key Problem
Super Bowl → US region has 10× more events → consumer partition lags Fix: Sub-partition hot regions (US-east, US-west, US-central) Merge in a second Flink stage. Or: key by (region, topic_hash % 100) → 100 sub-partitions
Stale Trend: Topic That Should Have Decayed But Didn't
"#MondayMotivation" spikes every Monday morning (expected).
But if baseline hasn't been updated in 2 weeks (Spark job failed):
→ Old baseline is too low → appears as trending every Monday → wrong!
Detection: Monitor baseline freshness
If baseline is > 14 days old → use conservative fallback:
fallback_baseline = max(V_baseline, V_current × 0.5)
This prevents anything from appearing as more than 2× "normal"
Fix: Alert on Spark baseline job failures; auto-retry with backoffFlink Checkpoint Recovery: Events Replayed Twice
Flink checkpoints at T=10:00 (offset=1000) Processes events 1001-1500 Crashes at T=10:03 before next checkpoint Recovery: restart from offset 1000 → events 1001-1500 replayed! → Counts for topics in those events are DOUBLED Solution: Flink's exactly-once guarantee with Kafka transactions Flink writes to Kafka output topic + commits checkpoint atomically If crash → events re-consumed but duplicates detected via transaction For Redis output: idempotent ZADD (same score overwritten, not doubled) For counting: use Flink's managed state (restored from checkpoint, not Redis)
Trend Suppression: Sensitive Content
Breaking news: mass shooting → topic trends instantly
Problem: Surfacing this as "trending" feels exploitative
Content sensitivity pipeline:
1. Keyword blocklist: certain terms auto-suppressed from trending
2. Sentiment analysis: if topic sentiment is overwhelmingly negative
AND contains sensitive keywords → flag for review
3. Human review: on-call Trust & Safety team can manually suppress trends
4. Editorial override: trending list can be manually curated during crises
Twitter's approach: "Trending with context"
Instead of just "#shooting" → show "Mass shooting at X: law enforcement responding"
Adds responsible context rather than raw hashtagInterview Walkthrough
- Lead with velocity-over-volume: a topic trending means recent surge relative to its baseline, not highest absolute tweet count.
- Ingest events (tweets, searches, posts) into Kafka partitioned by topic hash for ordered per-topic counting.
- Compute scores in Flink with tumbling windows and watermarks — allow 2–5 minutes of late events before sealing a window.
- Use Count-Min Sketch for approximate per-topic counts at 500K events/sec — exact HashMap counting exhausts memory.
- Apply spam filters: bot account age, coordinated posting patterns, and velocity anomalies that spike faster than organic growth.
- Segment trends by geography and category — a global hashtag and a local news event need separate leaderboards.
- Handle hot keys by sharding high-volume topics across sub-partitions and merging counts in a downstream aggregator.
- Common pitfall: ranking by lifetime total volume — evergreen topics like #Music always win and genuine breaking news never surfaces.
Exact Counting vs Count-Min Sketch
Exact HashMap: O(N) memory, N=10M unique topics → 2 GB per worker Count-Min Sketch: Fixed 200 KB, approximate (overestimates only) Hybrid "Sketch + Promote" ⭐: 1. CMS for ALL topics (200 KB) 2. If CMS estimate > threshold → promote to exact HashMap 3. Only ~10K promoted topics tracked exactly 4. Top-K from exact counters → accurate ranking Memory: 2.2 MB total vs 2 GB → 1000× reduction
Flink vs Spark Streaming vs Kafka Streams
Feature Flink ⭐ Spark Streaming Kafka Streams
────────────────────────────────────────────────────────────────
Latency True streaming Micro-batch (seconds) True streaming (ms)
(ms)
Windowing Rich (sliding, Basic Basic
session)
State RocksDB, Spark state RocksDB
checkpointed
Best for Complex event Batch+stream hybrid Simple pipelines
processing
Choose Flink: needs sliding windows, low latency, complex per-key state.Personalized Trends
Standard trends: Top 20 for a region (same for everyone in US)
Personalized trends: Mix in topics relevant to the user's interests
Implementation:
User interests = topics they follow + topics they engage with
For each trending topic:
relevance = cosine_similarity(topic_embedding, user_interest_embedding)
Final_rank = w1 × velocity + w2 × relevance
Example: User follows #Tech and #AI
#WorldCup trending (velocity=15, relevance=0.1) → final=1.6
#GPT5Release trending (velocity=5, relevance=0.9) → final=2.3
→ User sees #GPT5Release higher (personalized!)
Pre-computed: user interest embeddings updated daily via Spark
Real-time: re-rank top 50 regional trends using user embedding (< 10ms)Trend Category Auto-Classification
Categorize trending topics automatically: Sports, Politics, Entertainment, Tech, etc.
Approach:
1. Named Entity Recognition (NER): "Taylor Swift" → entity type = PERSON
2. Topic model: LDA / BERTopic on sample tweets mentioning the topic
3. Knowledge graph lookup: "Taylor Swift" → Wikidata → category = musician
4. Hashtag taxonomy: #WorldCup → manually curated mapping → Sports
Fallback: If no category detected → "Trending" (uncategorized)
Used for: filtering ("show only Sports trends"), display contextMulti-Language Trend Merging
Same event trending in multiple languages: "#CopaDelMundo" (Spanish), "#WorldCup" (English), "#ワールドカップ" (Japanese) Should these merge into ONE trend or stay separate? Approach: Entity-based merging 1. NER extracts entities: all three map to entity "FIFA World Cup 2026" 2. Merge counts across language variants 3. Display: use the user's language variant but aggregate volume Without merging: Same event appears 3 times in trending → redundant With merging: One entry showing "2.3M tweets" (across all languages) → cleaner Implementation: entity_id as the trend key, not the raw hashtag text Flink: group by entity_id (resolved via a pre-built entity dictionary)
Staff interviews expect you to articulate how the system evolves under real growth — not jump straight to the final architecture.
Phase 1: MVP (0 to 100K users)
Monolith or minimal services proving core trending topics flows. Optimize for shipping speed and correctness over scale.
Key components: Single region · Primary DB + Redis cache · Synchronous core path · Basic monitoring
Move to next phase when: p99 latency exceeds SLO or DB CPU sustained above 70%
Phase 2: Growth (100K to 10M users)
Split read/write paths, introduce async processing for non-critical work, add caching layers and horizontal scaling.
Key components: Read replicas or CQRS · Message queue for async work · CDN / edge caching · Service-level SLOs
Move to next phase when: Hot keys, fan-out bottlenecks, or ops toil from manual scaling
Phase 3: Scale (10M+ users)
Shard data plane, multi-region active-active or active-passive, formal DR runbooks, cost optimization.
Key components: Database sharding / partitioning · Multi-region replication · Auto-scaling + chaos testing · Dedicated platform/SRE ownership
Move to next phase when: Regional failure domain risk, compliance data residency, or linear cost growth unsustainable
SLOs & Error Budgets
| Metric | Target | Rationale |
|---|---|---|
| Core user-facing availability | 99.95% | Budget for planned maintenance + unplanned failures without user-visible outage. |
| p99 latency (critical path) | Problem-specific — state target early and tie to capacity math | Interview credibility comes from connecting SLO to architecture choices. |
| Error rate (5xx) | < 0.1% | Distinguishes transient blips from systemic failure requiring rollback. |
| Data durability | 99.999999999% (11 nines) for committed writes | Define which operations require fsync/quorum vs async replication. |
Incident Scenarios (2am reality)
| Scenario | How you detect | Mitigation |
|---|---|---|
| Primary database unavailable | Health check failures, connection pool exhaustion alerts, elevated 5xx | Failover to replica / promote standby; enable read-only degraded mode if writes impossible; queue writes if async path exists |
| Traffic spike (10× normal) | RPS anomaly alert, autoscaling lag, latency SLO burn rate | Rate limit non-critical endpoints; scale read path horizontally; pre-warm caches; shed load on expensive operations |
| Bad deploy causing elevated errors | Canary metric regression, error budget burn, deployment correlation | Automated rollback within 5 minutes; feature flag kill switch; maintain N-1 compatibility |
Cost Drivers (Staff lens)
- Egress bandwidth and CDN (often dominates media/data-heavy systems)
- Database storage + IOPS at scale (plan compaction, TTL, tiering)
- Compute for async pipelines (right-size workers, spot instances for batch)
- Managed service premiums vs operational headcount trade-off
Multi-Region & DR
Start single-region with cross-AZ redundancy. Add read replicas in secondary region for DR. Move to active-active only when latency SLO or data residency requires it — accept conflict resolution complexity explicitly.