System Design Problem

Design a Distributed Message Broker (Kafka-style)

Commonly Asked By: LinkedIn, Confluent, Uber, Netflix

  • Publish: Producers publish messages to named topics.
  • Subscribe: Consumers subscribe to topics and receive messages in strict partition-level order.
  • Persistence: Messages are durably stored on disk for a configurable retention period (time-based or size-based).
  • Consumer Groups: Multiple consumers in a group share the processing load; each message is delivered to exactly one member of the group.
  • Ordering: Messages within a partition are strictly ordered (FIFO) via monotonically increasing offsets.
  • Delivery Semantics: Support at-least-once, at-most-once, and exactly-once delivery guarantees.
  • Replay: Consumers can re-read historic messages by seeking to any offset or timestamp.
  • Partitioning: Topics split into partitions for parallelism, load distribution, and horizontal scaling.
  • Log Compaction: Retain only the latest value per key, essential for changelogs and CDC use cases.
  • Schema Evolution: Support schema validation and compatibility checks as schemas evolve, via an external Schema Registry.
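
Several of these requirements (partitioning, per-partition offsets, replay, and log compaction) can be illustrated together with a small in-memory model. This is a minimal sketch, not Kafka's implementation: the names `Partition`, `Topic`, `partition_for`, `read_from`, and `compacted` are hypothetical, records live in a Python list rather than on disk, and the key hash is CRC32 chosen only because it is stable across runs (Kafka's default Java partitioner uses murmur2).

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Append-only log; a record's offset is its index in the list."""
    records: list = field(default_factory=list)

    def append(self, key, value):
        self.records.append((key, value))
        return len(self.records) - 1  # offset assigned to the new record

    def read_from(self, offset):
        # Replay: return (offset, key, value) for every record at or after `offset`.
        return [(i, k, v) for i, (k, v) in enumerate(self.records) if i >= offset]

    def compacted(self):
        # Log compaction: keep only the latest record per key.
        # Surviving records keep their original offsets.
        latest = {}
        for i, (k, _) in enumerate(self.records):
            latest[k] = i
        keep = set(latest.values())
        return [(i, k, v) for i, (k, v) in enumerate(self.records) if i in keep]


class Topic:
    def __init__(self, num_partitions):
        self.partitions = [Partition() for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: CRC32 as a stable hash, so a key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def publish(self, key, value):
        p = self.partition_for(key)
        return p, self.partitions[p].append(key, value)
```

Because every record with the same key lands in the same partition, ordering per key follows from ordering per partition. A consumer that remembers only an offset can resume or replay with `read_from(offset)`, and `compacted()` shows why compaction suits changelog topics: the latest value per key is enough to rebuild current state.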

The high-level design (HLD) decouples producers, brokers, and consumers. Producers do not go through an API gateway; they publish directly to the broker that leads each target partition. Cluster metadata is managed via KRaft (Kafka's Raft-based consensus protocol) instead of ZooKeeper.
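
Publishing directly to partition leaders implies that the producer caches a partition-to-leader map and refreshes it when leadership changes. The sketch below illustrates that routing loop under simplifying assumptions: `MetadataService`, `Broker`, `Producer`, and `NotLeaderError` are hypothetical names, a single in-process object stands in for the KRaft metadata quorum, and replication and networking are omitted entirely.

```python
class NotLeaderError(Exception):
    """Raised when a write reaches a broker that does not lead the partition."""


class Broker:
    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.leads = set()  # (topic, partition) pairs this broker currently leads
        self.logs = {}      # (topic, partition) -> list of values

    def append(self, topic, partition, value):
        tp = (topic, partition)
        if tp not in self.leads:
            raise NotLeaderError(tp)
        self.logs.setdefault(tp, []).append(value)


class MetadataService:
    """Stand-in for the metadata quorum: the source of truth for leadership."""

    def __init__(self, brokers):
        self.brokers = {b.broker_id: b for b in brokers}
        self.leaders = {}

    def elect(self, topic, partition, broker_id):
        tp = (topic, partition)
        old = self.leaders.get(tp)
        if old is not None:
            self.brokers[old].leads.discard(tp)
        self.brokers[broker_id].leads.add(tp)
        self.leaders[tp] = broker_id

    def snapshot(self):
        return dict(self.leaders)


class Producer:
    def __init__(self, metadata):
        self.metadata = metadata
        self.cache = metadata.snapshot()  # local partition -> leader map

    def send(self, topic, partition, value):
        tp = (topic, partition)
        for _ in range(2):  # first attempt, then one retry after a refresh
            broker = self.metadata.brokers[self.cache[tp]]
            try:
                broker.append(topic, partition, value)
                return broker.broker_id  # the broker that accepted the write
            except NotLeaderError:
                self.cache = self.metadata.snapshot()
        raise NotLeaderError(tp)
```

The design choice this highlights is that no proxy sits on the write path: a stale cache costs one rejected request and a metadata refresh, after which the producer talks to the new leader directly.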
