Design Live Likes & Reactions

This problem appears in multiple sheets. Depth expectations increase as you progress:

Track	What to demonstrate
Arch 75	Staff level: multi-region, cost at scale, migration path, and production metrics.

Interview Prompt

Design Design Live Likes & Reactions.

Clarifying Questions (ask before designing)

Question	Why it matters
Which of these is highest priority: High-throughput write aggregation, Buffered counters, WebSocket broadcast?	Forces scope negotiation — senior candidates trim before drawing boxes.
What scale should we design for — DAU, QPS, data volume?	Drives every capacity decision; shows structured thinking.
What are the read vs write patterns on the critical path?	Determines caching, DB choice, and replication topology.
What consistency and durability guarantees are required?	Separates strong-consistency paths from eventual ones — a senior differentiator.

Scope

In scope

High-throughput write aggregation
Buffered counters
WebSocket broadcast
Animation sync
Capacity estimation with shown math

Out of scope (state explicitly)

Full ML ranking model training pipeline
Direct messaging / chat (#07)
Ad insertion and monetization

Assumptions

Clarify scale (DAU, QPS, data volume) for live likes reactions in the first 5 minutes
Standard reliability target 99.9%–99.99% unless problem implies higher (payments, booking)
Managed cloud services (RDS, S3, Kafka, Redis) are acceptable building blocks

React to content: Allow users to emit live reactions (❤️ 😂 😮 😢 😡) to posts, videos, or active live streams.
Real-time count: Display reaction counts that update instantly for all active viewers.
Reaction animations: Render floating reaction emojis on live streaming screens in real-time.
Toggle reactions: Allow users to change or withdraw their reaction instantly.
Aggregate totals: Show the cumulative total counts grouped per reaction type (e.g., 1.2K ❤️, 300 😂).
Deduplication: Restrict every user to exactly one reaction per content item.

Metric	Calculation	Value
Peak Reactions / sec (viral live event)	From Peak Reactions / day ÷ 86400 (+ peak factor in value)	1,000,000
Avg Reactions / sec	From Avg Reactions / day ÷ 86400 (+ peak factor in value)	50,000
Reaction Record Size	Derived	50 bytes (user_id + content_id + type)
Write Throughput (Peak)	Given (peak load assumption)	50 MB/s
Active Live Events	Derived	10,000
Reactions per Event (Avg)	Given (typical workload assumption)	100,000

I/O and Bandwidth Calculations:
- Ingestion Bandwidth (Peak): 1,000,000 reactions/sec × 50 bytes = 50 MB/s incoming bandwidth.
- Daily Aggregate Counts:
  - 10,000 active events × 100,000 reactions = 1 Billion reactions recorded per day.
  - 1 Billion × 50 bytes = 50 GB storage space needed per day (raw persistence).

Reactions are submitted via a fast POST path to stateless Reaction Services which validate and record events in memory. Simultaneously, events stream into a Kafka partition pipeline. Reaction Aggregators batch and publish updates to Redis Pub/Subevery 500ms. Stateful WebSocket Gateways receive updates and broadcast batched delta animations to all viewers.

Loading...

1. The Hot Counter Aggregation Path ⭐

Under viral conditions, issuing direct database updates (e.g., UPDATE counters SET count = count + 1) will lock rows and immediately crash the database. We implement a hybrid, tiered storage solution:

Layer 1 — Redis (real-time, approximate):
  HINCRBY reaction_count:{content_id} heart 1
  HINCRBY reaction_count:{content_id} laugh 1
  
  Reads: HGETALL reaction_count:{content_id} → {heart: 15234, laugh: 3421}
  - Redis handles 1M+ INCR/sec easily (single-threaded atomic operations in memory).

Layer 2 — Cassandra (persistent, exact):
  Kafka consumer batches reactions every 5 seconds.
  Batch write to Cassandra:
    INSERT INTO reactions (content_id, user_id, type, timestamp) ...
  Update aggregate counts:
    UPDATE reaction_counts SET count = count + batch_size WHERE content_id = ?;
  
  5-second eventual consistency is completely invisible to users.

2. Deduplication Mechanisms

To maintain the "one reaction per user per content" rule, the system checks and dedups writes at two distinct levels:

Fast dedup check in Redis:
  SADD reacted:{content_id} {user_id}
  If SADD returns 0 → already reacted → update type (toggle) instead of increment.

Persistent dedup in Cassandra:
  PRIMARY KEY (content_id, user_id) → natural dedup on upsert.

3. Real-Time Broadcast & Batching

Broadcasting 1M individual messages/sec to WebSocket connections will overload client browsers and network pipes. We batch updates using delta frames:

For live streams with millions of viewers, don't broadcast individual reactions:
- Every 500ms, aggregate reactions received in that window:
    { "heart": +342, "laugh": +89, "wow": +23 }
- Broadcast ONE message with deltas to all WebSocket clients.
- Client-side: Smoothly animate counters incrementing by the delta, and trigger a random sample of 5 floating emojis (rather than rendering all 342).

Event Bus Design (Kafka)

Topic: live_likes_reactions-events
  Partitions: 64 (scale consumers horizontally)
  Partition key: entity_id (user_id / order_id — preserves per-entity ordering)
  Retention: 7 days (compliance) or 24h (high-volume telemetry)
  Replication factor: 3, min.insync.replicas: 2

Producer: idempotent producer enabled (enable.idempotence=true)
Consumer: consumer group "live_likes_reactions-processors"
  - At-least-once delivery + idempotent handlers (dedup by event_id)
  - DLQ topic: live_likes_reactions-events-dlq (poison messages after 3 retries)
  - Lag alert: consumer lag > 60s → scale workers

Design Live Likes & Reactions: async side effects MUST NOT block the synchronous API response.
  Sync path: validate → persist source of truth → publish event → return 201
  Async path: consumers update caches, indexes, notifications, aggregates

1. React to Content

HTTP

POST /api/v1/reactions
{
  "content_id": "post-123",
  "type": "heart"
}
→ Response: 200 OK
{
  "current_counts": { "heart": 15235, "laugh": 3421 }
}

2. Remove Reaction

HTTP

DELETE /api/v1/reactions?content_id=post-123
→ Response: 200 OK

3. Retrieve Reaction Counts

HTTP

GET /api/v1/reactions/counts?content_id=post-123
→ Response: 200 OK
{
  "heart": 15234,
  "laugh": 3421,
  "wow": 892,
  "total": 19547
}

4. WebSocket Live Updates

JSON

// Server pushes deltas over WebSocket every 500ms:
{
  "type": "reaction_update",
  "content_id": "live-456", 
  "deltas": { "heart": 342, "laugh": 89 },
  "total": 1523400
}

Common Error Responses

400 Bad Request: invalid input, missing fields, or malformed JSON
401 Unauthorized: missing or invalid auth token or API key
403 Forbidden: authenticated but insufficient permissions
404 Not Found: resource ID does not exist
409 Conflict: duplicate write or version conflict; retry with idempotency key
422 Unprocessable Entity: valid syntax but invalid business logic
429 Too Many Requests: rate limit exceeded; honor Retry-After header
500 Internal Error: unexpected server fault; retry with idempotency key
503 Service Unavailable: dependency down or overloaded; use exponential backoff

Redis Structure

# Real-time reaction counts stored in a Hash structure
Key:   reaction_count:{content_id}
Type:  Hash
Fields: { "heart": 15234, "laugh": 3421, "angry": 88 }

# Dedup tracking set
Key:   reacted:{content_id}
Type:  Set (or Bloom Filter)
TTL:   24 hours for live streams

Cassandra Schema Design

SQL

-- Individual user reactions for persistence and dedup
CREATE TABLE user_reactions (
    content_id  UUID,
    user_id     UUID,
    type        TEXT,       -- heart, laugh, wow, sad, angry
    created_at  TIMESTAMP,
    PRIMARY KEY (content_id, user_id)
);

-- Persisted aggregate reaction counts
CREATE TABLE reaction_counts (
    content_id  UUID PRIMARY KEY,
    heart       COUNTER,
    laugh       COUNTER,
    wow         COUNTER,
    sad         COUNTER,
    angry       COUNTER,
    total       COUNTER
);

Concern Scenario	Mitigation Strategy
Redis Cluster Node Outage	Redis Cluster employs primary-standby failover. If memory is lost, counts can be completely reconstructed by executing aggregate count jobs on Cassandra.
Double Counting / Retries	Checked atomically using SADD in Redis. Cassandra inserts are fully idempotent since primary keys naturally overwrite duplicate rows.
Kafka Consumer Backlog	Persistent database records may experience brief lag, but real-time client traffic continues uninterrupted via Redis memory.

1. Toggle Reaction Race Conditions ⭐

If a user rapidly taps Like, Unlike, and Like again, out-of-order packet delivery could corrupt the count. We leverage Redis set idempotency to defend the state:

User rapidly clicks: Like → Unlike → Like (within 100ms)

Expected arrival order:
  T=0ms:  LIKE received → SADD reacted:{cid} user-1 → returns 1 → HINCRBY heart +1
  T=30ms: UNLIKE received → SREM reacted:{cid} user-1 → returns 1 → HINCRBY heart -1
  T=60ms: LIKE received → SADD reacted:{cid} user-1 → returns 1 → HINCRBY heart +1
  Result: heart count is +1 (correct)

What if network delays reverse arrival order?
  T=0ms:  UNLIKE arrives first → SREM user-1 → returns 0 (not in set) → no decrement
  T=30ms: LIKE arrives → SADD user-1 → returns 1 → HINCRBY heart +1
  T=60ms: LIKE (second) arrives → SADD user-1 → returns 0 (already in) → no increment
  Result: heart count is +1. Redis SET structures resolve idempotency naturally.

2. Viral Hot Keys (1M+ events on a single stream) ⭐

When 1M users react to a single stream, all traffic hits a single Redis node partition. We prevent node saturation:

Problem: content_id "super-bowl-2026" → 1M reactions/sec hitting ONE Redis node.

Solutions:
1. In-Memory In-Process Aggregation (Reaction Service):
   - Buffer reactions in service memory for 100ms.
   - Send a single batch HINCRBY with cumulative counts (e.g. +342).
   - Reduces 1M write commands to ~200 Redis ops/sec across 20 nodes.

2. Sharded Sub-Counters:
   - Split single count key into N sharded keys:
     reaction_count:{cid}:0, reaction_count:{cid}:1, ..., reaction_count:{cid}:7
   - Query: SUM across all 8 sub-counters.

3. Local Redis Sidecars with Write-Behind:
   - Accept writes into local host Redis instances.
   - Background tasks sync accumulated deltas to the central cluster.

3. Redis Deduplication Set Memory Exhaustion ⭐

Storing 50M user IDs in a Redis set consumes ~800MB per event, causing rapid cluster memory exhaustion. We solve this memory bottleneck:

Problem: SADD reacted:{content_id} stores all user_ids.
- 50M unique users × 16 bytes (UUID) = 800 MB memory for one post.

Solutions:
1. Bloom Filters ⭐:
   - Use ~15 bits per element.
   - 50M elements × 15 bits = 93 MB (88% memory reduction).
   - Trade-off: ~1% false-positive rate (some clicks are dropped, which is fine for reactions).

2. TTL cleanup:
   - Set TTL to 24h. Old historical events fall back to Cassandra checks.

3. Split-Tier checking:
   - Use Redis dedup exclusively for active live streams. 
   - Static post reactions run queries directly against Cassandra.

1. Write Path Tiering (Speed vs Durability)

Writing every transaction synchronously to disk at 1M/sec is structurally impossible. We trade off absolute synchronous durability for highly responsive in-memory writes. If a collector crash occurs, up to 5 seconds of transient counts might be lost, but the system stays fast and operational.

2. Counter Implementation Architectures

Approach	Latency	Accuracy	Durability	Best For
Redis HINCRBY ⭐	< 1 ms	Exact in Redis memory	Volatile (syncs periodically)	Real-time viewer display updates
Cassandra COUNTER	~5 ms	Exact	Highly durable	Persistent aggregate history
Flink window aggregation	1-5 sec window	Exact within window	Highly durable output	Complex analytics (region, time)
HyperLogLog (HLL)	< 1 ms	~0.81% error rate	Volatile	Unique reactor count (approximate)

SLOs & Error Budgets

Metric	Target	Rationale
Core user-facing availability	99.95%	Budget for planned maintenance + unplanned failures without user-visible outage.
p99 latency (critical path)	Problem-specific — state target early and tie to capacity math	Interview credibility comes from connecting SLO to architecture choices.
Error rate (5xx)	< 0.1%	Distinguishes transient blips from systemic failure requiring rollback.
Data durability	99.999999999% (11 nines) for committed writes	Define which operations require fsync/quorum vs async replication.

Incident Scenarios (2am reality)

Scenario	How you detect	Mitigation
Primary database unavailable	Health check failures, connection pool exhaustion alerts, elevated 5xx	Failover to replica / promote standby; enable read-only degraded mode if writes impossible; queue writes if async path exists
Traffic spike (10× normal)	RPS anomaly alert, autoscaling lag, latency SLO burn rate	Rate limit non-critical endpoints; scale read path horizontally; pre-warm caches; shed load on expensive operations
Bad deploy causing elevated errors	Canary metric regression, error budget burn, deployment correlation	Automated rollback within 5 minutes; feature flag kill switch; maintain N-1 compatibility

Cost Drivers (Staff lens)

Egress bandwidth and CDN (often dominates media/data-heavy systems)
Database storage + IOPS at scale (plan compaction, TTL, tiering)
Compute for async pipelines (right-size workers, spot instances for batch)
Managed service premiums vs operational headcount trade-off

Multi-Region & DR

Start single-region with cross-AZ redundancy. Add read replicas in secondary region for DR. Move to active-active only when latency SLO or data residency requires it — accept conflict resolution complexity explicitly.

Interview Prompt

Clarifying Questions (ask before designing)

Scope

In scope

Out of scope (state explicitly)

Assumptions

1. The Hot Counter Aggregation Path ⭐

2. Deduplication Mechanisms

3. Real-Time Broadcast & Batching

Event Bus Design (Kafka)

1. React to Content

2. Remove Reaction

3. Retrieve Reaction Counts

4. WebSocket Live Updates

Common Error Responses

Redis Structure

Cassandra Schema Design

1. Toggle Reaction Race Conditions ⭐

2. Viral Hot Keys (1M+ events on a single stream) ⭐

3. Redis Deduplication Set Memory Exhaustion ⭐

Count Display Formatting & User Perception

Interview Walkthrough

1. Write Path Tiering (Speed vs Durability)

2. Counter Implementation Architectures

Phase 1: MVP (0 to 100K users)

Phase 2: Growth (100K to 10M users)

Phase 3: Scale (10M+ users)

SLOs & Error Budgets

Incident Scenarios (2am reality)

Cost Drivers (Staff lens)

Multi-Region & DR