APACHE KAFKA: A REAL-TIME MESSAGING SYSTEM

Khemnath Chauhan
Jun 21, 2023

APACHE KAFKA:

According to the official definition, Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.

Put simply, Kafka is a medium for transporting data (messages, in Kafka terms) from a source system to a destination system.

A Kafka system has four main parts:

  • Broker: Handles all requests from clients (produce, consume, and metadata) and keeps data replicated within the cluster. There can be one or more brokers in a cluster.
  • ZooKeeper: Keeps the state of the cluster (brokers, topics, users).
  • Producer: Sends records to a broker.
  • Consumer: Consumes batches of records from the broker.
Figure: Kafka flow

KAFKA BROKER:

A Kafka cluster consists of one or more servers (Kafka brokers) running Kafka. Producers are processes that push records into Kafka topics within the broker. A consumer pulls records off a Kafka topic.

Running a single Kafka broker is possible, but it doesn’t provide all the benefits of a multi-broker cluster, such as data replication.
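
To see why replication needs more than one broker, here is a minimal sketch in Python. It is not Kafka's real placement logic (which also considers things like rack awareness); it is a simplified round-robin assignment, and the function name `assign_replicas` and its parameters are my own illustration. The one behaviour it mirrors is that a replication factor larger than the number of available brokers cannot be satisfied.

```python
from __future__ import annotations


def assign_replicas(brokers: list[int], partitions: int, replication_factor: int) -> dict[int, list[int]]:
    """Place each partition's replicas on distinct brokers (simplified round-robin).

    Hypothetical helper for illustration only; not part of any Kafka API.
    """
    if replication_factor > len(brokers):
        # Every replica of a partition must live on a different broker.
        raise ValueError(
            f"replication factor {replication_factor} is larger than "
            f"the number of brokers ({len(brokers)})"
        )
    assignment: dict[int, list[int]] = {}
    for partition in range(partitions):
        assignment[partition] = [
            brokers[(partition + replica) % len(brokers)]
            for replica in range(replication_factor)
        ]
    return assignment


# Three brokers can hold two copies of every partition.
print(assign_replicas([1, 2, 3], partitions=3, replication_factor=2))
```

With a single broker, the same call with `replication_factor=2` raises an error, which is the point of the paragraph above: one broker can only ever hold one copy.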

The brokers in the cluster are managed by ZooKeeper. A cluster may run several ZooKeeper servers; the usual recommendation is three to five. An odd number is preferred because ZooKeeper needs a majority of its servers to be available, and an even-sized group tolerates no more failures than the odd size just below it. The count is kept as low as possible to limit overhead.
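
The majority rule is easy to check with a little arithmetic. The sketch below is plain Python, not ZooKeeper code, and the function names are my own; it only computes how many servers form a majority and how many can fail before that majority is lost.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority is still available."""
    return ensemble_size - majority(ensemble_size)


for size in range(1, 7):
    print(
        f"{size} server(s): majority={majority(size)}, "
        f"tolerates {tolerated_failures(size)} failure(s)"
    )
```

Going from three servers to four adds a machine without adding any fault tolerance, which is why the odd sizes are the ones usually chosen.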

KAFKA TOPIC:

In Kafka, a topic represents a category or container that holds messages related to a specific subject.

Producers are responsible for publishing messages to topics. Each message is written to the specific topic the producer chooses for it.

Consumers subscribe to one or more topics and receive the messages published on those topics. They can process the messages in real time, store them for later analysis, or perform any other desired actions.
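
The relationship between topics, producers, and consumers can be sketched with a toy in-memory model. This is not the Kafka client API: `ToyBroker` and `ToyConsumer` are hypothetical classes of my own, and the model ignores partitions, consumer groups, persistence, and the network. It only shows the idea that a topic is an append-only sequence of messages and that a consumer pulls from it while remembering how far it has read.

```python
from __future__ import annotations

from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker: each topic is an append-only list."""

    def __init__(self) -> None:
        self._topics: dict[str, list[str]] = defaultdict(list)

    def produce(self, topic: str, message: str) -> int:
        """Append a message to a topic and return its position (offset)."""
        log = self._topics[topic]
        log.append(message)
        return len(log) - 1

    def fetch(self, topic: str, offset: int, max_records: int = 10) -> list[str]:
        """Return up to max_records messages starting at the given offset."""
        return self._topics[topic][offset:offset + max_records]


class ToyConsumer:
    """Pulls batches from subscribed topics and tracks its own read position."""

    def __init__(self, broker: ToyBroker, topics: list[str]) -> None:
        self._broker = broker
        self._offsets = {topic: 0 for topic in topics}

    def poll(self) -> dict[str, list[str]]:
        """Fetch any unread messages from every subscribed topic."""
        batch: dict[str, list[str]] = {}
        for topic, offset in self._offsets.items():
            records = self._broker.fetch(topic, offset)
            if records:
                batch[topic] = records
                self._offsets[topic] = offset + len(records)
        return batch


broker = ToyBroker()
broker.produce("orders", "order-1")
broker.produce("orders", "order-2")
broker.produce("payments", "payment-1")

consumer = ToyConsumer(broker, ["orders"])
print(consumer.poll())  # only the "orders" messages; "payments" is not subscribed
print(consumer.poll())  # nothing new has been produced since the last poll
```

The consumer subscribed only to "orders" never sees the "payments" message, and a second poll returns nothing until a producer writes again.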

KAFKA INSTALLATION AND SETUP:

Apache Kafka is implemented in Scala and Java, both of which run on the Java Virtual Machine (JVM). The broker is written largely in Scala, while the current producer and consumer clients are written in Java.

Therefore, to run Kafka and its core components, including the Kafka broker, you need a JVM installed on the machine where Kafka runs.
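
Before installing Kafka, it can help to confirm that a Java runtime is actually available. The following is a small sketch using only the Python standard library; the function names are my own, and it assumes the runtime is exposed as a `java` executable on the PATH (some setups rely on `JAVA_HOME` instead, which this does not check).

```python
from __future__ import annotations

import shutil
import subprocess


def find_java() -> str | None:
    """Return the path of the `java` executable on PATH, or None if absent."""
    return shutil.which("java")


def java_version_banner(java_path: str) -> str:
    """Run `java -version` and return whatever version text it reports."""
    result = subprocess.run(
        [java_path, "-version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # `java -version` usually writes its banner to stderr rather than stdout.
    return (result.stderr or result.stdout).strip()


java = find_java()
if java is None:
    print("No java executable found on PATH; install a Java runtime before starting Kafka.")
else:
    print(java_version_banner(java))
```

Which Java version is required depends on the Kafka release, so check the documentation for the version you install.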
