Recently, I’ve been diving into system design: what makes systems scalable, reliable, and fault-tolerant. Besides reading books, I wanted to actually build something real. That’s when I decided to take on a fairly ambitious project: building my own Kafka.
I chose Kafka because it’s a relatively modern data system designed for today’s event-driven architectures. It gives me a chance to learn in depth about real-time stream processing and distributed systems.
On the internet I found many people who have built their own Kafka too, but all of them are toy versions or stripped-down clones (for example, single-node only). That can be done quickly and easily, but my goal is to build something as close to the real Kafka as possible.
Here are rough milestones (unordered):
Basically, I want to understand Kafka from the inside out: how each part fits together, where the trade-offs are, and what really makes it scale.
For me, this is about learning by doing. Rebuilding something as mature as Kafka forces me to think deeply about distributed systems design, consistency, and real-world performance trade-offs.
I chose to name this project Khronikle, a blend of Kafka and chronicle (“a written record of events in the order in which they happened”, per the Oxford dictionary).