

Kafka is a powerful piece of software that enables very useful architectural patterns. In this post I want to capture my understanding so I can reference it in the future. I will keep updating the post as I acquire more knowledge or discover refinements and errors.

Kafka is a distributed, log-based messaging system. A set of components works in parallel to write messages to and read messages from partitions, and that work has to be consistent: producers, brokers, and consumers must all share a coherent view of the system's state.

Some vocabulary first

HA (High Availability) vs FT (Fault Tolerance)

High availability means the cluster keeps serving producers and consumers when a machine fails; fault tolerance means that messages Kafka has already acknowledged survive the failure. In a Kafka system:

Replication factor gives you fault tolerance.
Number of partitions enables scalability and parallelism, not HA directly.
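
As a concrete sketch, this is how both knobs might be set when creating a topic with the confluent-kafka Python client. The broker address, topic name, and the exact numbers are illustrative assumptions, not recommendations.

```python
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})  # assumed local broker

# 6 partitions -> up to 6 consumers in one group can read in parallel (scalability).
# Replication factor 3 -> each partition lives on 3 brokers (fault tolerance).
topic = NewTopic("events", num_partitions=6, replication_factor=3)

for name, future in admin.create_topics([topic]).items():
    future.result()  # raises if the creation failed
    print(f"created topic {name}")
```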

The functionality

To me, Kafka provides the following functionality. First, it helps decouple services—consumers and producers don’t talk directly, they use Kafka as a shared interface. Kafka is very performant: it can ingest large volumes of messages with low latency.

Kafka provides durable storage. Messages are written to disk and removed only based on retention policies. A byproduct of this is replayability: you can replay messages to consumers as many times as needed.
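
A small sketch of the replay idea, assuming the confluent-kafka Python client and a hypothetical topic named events: the consumer rewinds its assigned partitions to the earliest retained offset, so everything still on disk is delivered again.

```python
from confluent_kafka import Consumer, OFFSET_BEGINNING

def rewind(consumer, partitions):
    # On assignment, point every partition back to the earliest retained offset.
    for p in partitions:
        p.offset = OFFSET_BEGINNING
    consumer.assign(partitions)

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # assumed local broker
    "group.id": "replay-demo",              # hypothetical group name
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["events"], on_assign=rewind)

while True:  # Ctrl-C to stop
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    print(msg.offset(), msg.value())
```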

Kafka is scalable—you can increase throughput by adding partitions and consumers. It’s also fault tolerant: messages are persisted and replicated across machines. Kafka has an “ACK” mechanism that, when configured properly, offers strong delivery guarantees.

The main components

What is metadata?

To answer that, we need to talk about ZooKeeper or KRaft.

Metadata refers to the structure and state of the Kafka cluster: topics, partitions, configurations, broker registrations, and leader assignments.
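
Clients can fetch a snapshot of this metadata themselves. A sketch with the confluent-kafka Python client, assuming a broker at localhost:9092:

```python
from confluent_kafka.admin import AdminClient

admin = AdminClient({"bootstrap.servers": "localhost:9092"})  # assumed local broker
md = admin.list_topics(timeout=10)  # snapshot of the cluster metadata

for b in md.brokers.values():
    print(f"broker {b.id} at {b.host}:{b.port}")

for t in md.topics.values():
    for p in t.partitions.values():
        print(f"{t.topic}[{p.id}] leader={p.leader} replicas={p.replicas} isr={p.isrs}")
```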

Topic vs Partition vs Consumer Group

A topic is a logical channel where you publish messages. A partition is a subdivision of a topic and is where messages are physically stored: an append-only log in which messages, once written, are immutable. Kafka spreads partitions across brokers to support fault tolerance and scalability.

A consumer group is a set of consumers working together to process a topic: each partition is assigned to exactly one consumer in the group, so adding consumers (up to the number of partitions) increases processing parallelism.
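
To make the topic/partition distinction concrete, here is a producer sketch (confluent-kafka, hypothetical topic and keys). Messages are published to the topic; Kafka hashes the key to pick a partition, so all messages with the same key land in the same partition and stay ordered relative to each other.

```python
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})  # assumed local broker

for i in range(10):
    # The key decides the partition: same key -> same partition -> ordered per key.
    producer.produce("events", key=f"user-{i % 3}", value=f"click {i}")

producer.flush()  # wait for all outstanding messages to be delivered
```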

The purpose of ZooKeeper / KRaft

Machines can fail, networks can partition, and messages can arrive late or out of order. So we need a mechanism to ensure that even if something goes wrong, the cluster behaves in a predictable, coordinated way.

That’s where consensus comes in.

Kafka uses a leader-follower model for partitions. Each partition has a single leader replica and a set of follower replicas; the followers that are fully caught up with the leader form the in-sync replica set (ISR).

Producers write to the partition leader. The leader appends to a write-ahead log, replicates to the in-sync replicas (ISRs), and then acknowledges the write.

Kafka guarantees consistency at the partition level: messages in a partition are strictly ordered and replicated in that order. Consumers read from the partition leader and track their position using offsets, which they can commit for fault-tolerant processing.
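
A sketch of the offset-tracking part, again assuming the confluent-kafka client: auto-commit is disabled and the consumer commits only after it has processed a message, so on restart it resumes from the last committed offset.

```python
from confluent_kafka import Consumer

def handle(value):
    print("processing", value)  # placeholder for real work

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # assumed local broker
    "group.id": "payments",                 # hypothetical group
    "enable.auto.commit": False,            # we commit explicitly below
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["events"])

while True:  # Ctrl-C to stop
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    handle(msg.value())
    consumer.commit(message=msg, asynchronous=False)  # mark this offset as done
```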

Kafka uses consensus (ZooKeeper or KRaft) to ensure that all brokers agree on who’s in charge of the cluster and who’s leading each partition. This coordination is what lets Kafka recover safely and continue working correctly when machines fail.

ZooKeeper is a separate system. Kafka brokers connect to any ZooKeeper node to get metadata. The ZooKeeper nodes maintain a consistent state by coordinating with each other via a consensus algorithm.

It is also important how the ZooKeeper nodes and the Kafka brokers talk to each other: each broker registers itself in ZooKeeper as an ephemeral znode and watches for metadata changes, while the ZooKeeper ensemble keeps that state consistent through its own consensus protocol.

Consumer groups

Kafka uses a group coordinator to assign partitions to consumers in a group. Each partition is assigned to only one consumer at a time.

When a consumer leaves or crashes, Kafka triggers a rebalance. The coordinator reassigns partitions among the remaining consumers.

Rebalance: a consumer joins or leaves the group, and the coordinator reassigns the topic's partitions among the current members.
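
You can observe a rebalance from the client side. With the confluent-kafka client you can register callbacks that fire when partitions are assigned to or revoked from a consumer (names below are illustrative):

```python
from confluent_kafka import Consumer

def on_assign(consumer, partitions):
    print("assigned:", [p.partition for p in partitions])

def on_revoke(consumer, partitions):
    # Fired at the start of a rebalance, before the partitions move elsewhere.
    print("revoked:", [p.partition for p in partitions])

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # assumed local broker
    "group.id": "rebalance-demo",           # hypothetical group
})
consumer.subscribe(["events"], on_assign=on_assign, on_revoke=on_revoke)

while True:  # start/stop other consumers in the same group to see the callbacks fire
    consumer.poll(1.0)
```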

The acks setting

You set the acks option in the producer client library. With acks=0 the producer does not wait for any acknowledgement, with acks=1 it waits only for the partition leader, and with acks=all it waits until all in-sync replicas have the write.

Use acks=all when data loss is unacceptable. For lower latency and looser durability, use acks=1 or acks=0.
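
A sketch of the durable end of that trade-off with the confluent-kafka producer: acks=all plus a delivery callback, so the application learns whether the write was actually acknowledged.

```python
from confluent_kafka import Producer

def on_delivery(err, msg):
    if err is not None:
        print("delivery failed:", err)  # not acknowledged; decide whether to retry
    else:
        print("stored at", msg.topic(), msg.partition(), msg.offset())

producer = Producer({
    "bootstrap.servers": "localhost:9092",  # assumed local broker
    "acks": "all",                          # wait for all in-sync replicas
    "enable.idempotence": True,             # avoids duplicates when the client retries
})

producer.produce("events", value=b"order placed", callback=on_delivery)
producer.flush()  # block until the callback has run
```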

Can data still be lost?

Yes, in some edge cases:

  1. Unclean leader election

If the leader broker crashes and unclean.leader.election.enable=true, Kafka may elect an out-of-sync replica as leader. This can lead to data loss if the leader hadn’t replicated all recent writes.

If the setting is false (the default), Kafka waits until a proper in-sync replica is available.

  2. Message in memory not flushed to disk

Kafka acknowledges a write once it is in the OS page cache, not necessarily on disk. If the broker crashes before the data is flushed and no in-sync replica has a copy, the message is lost.

  3. Low min.insync.replicas

If this is set too low (for example 1), a write with acks=all can be acknowledged when only the leader has it. You want it to be at least 2 for safety.

  4. Disk corruption

Kafka may think data was written to disk, but it wasn’t. Hardware issues or file system problems can cause this.

How to reduce risk of data loss

Putting the previous points together: use a replication factor of at least 3, set min.insync.replicas to at least 2, leave unclean.leader.election.enable=false, and produce with acks=all.
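
A minimal sketch of that configuration with the confluent-kafka Python client; the broker address and topic name are assumptions for illustration.

```python
from confluent_kafka import Producer
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})  # assumed local broker

topic = NewTopic(
    "payments",                   # hypothetical topic
    num_partitions=6,
    replication_factor=3,         # survive the loss of up to two brokers
    config={
        "min.insync.replicas": "2",                # a write needs the leader plus one follower
        "unclean.leader.election.enable": "false", # never promote an out-of-sync replica
    },
)
admin.create_topics([topic])["payments"].result()

producer = Producer({"bootstrap.servers": "localhost:9092", "acks": "all"})
producer.produce("payments", value=b"important event")
producer.flush()
```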

Tags: infrastructure