Kafka Interfaces Tutorial

5.1 Lesson 5—Kafka Interfaces

Hello and welcome to lesson 5 of the Apache Kafka Developer course offered by Simplilearn. This lesson provides information on Kafka interfaces.

5.2 Objectives

After completing this lesson, you will be able to:
• Identify interfaces to Kafka
• Explain how to use producer APIs to create messages
• Explain how to use consumer APIs to read messages
• List the steps to compile and run a Java program for Kafka

5.3 Kafka Interfaces—Introduction

Kafka provides various commands to interface with the Kafka cluster. It also provides:
• libraries that can be called from other languages;
• a Java interface to develop programs for accessing messages; and
• commands to:
  o create and modify topics;
  o produce and consume messages; and
  o access classes within Kafka.

5.3 Kafka Interfaces—Introduction

Kafka provides various commands to interface with Kafka cluster. It also provides: • libraries that can be called from other languages. • Java interface to develop programs for accessing messages. • commands to: o create and modify topics; o produce and consume messages; and o access classes within Kafka.

5.4 Creating a Topic

Kafka provides the kafka-topics.sh command to create and modify topics. In our installation, this command is available in the /usr/local/kafka/bin directory and was added to the path during installation. If you run the command without parameters, it prints its usage. Various options are available to create and modify a topic using this command. To create a topic, the following command can be used: kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 2 --topic test. This creates a new topic called ‘test’ with the --topic option. A single ZooKeeper is specified for the cluster at localhost:2181. The topic is created with a replication factor of 3, which tolerates 2 instance failures. The parallelism is set to 2 by the number of partitions. Once this command creates the topic ‘test’, you can create and read messages with this topic. The topic parameters can be modified with the --alter option, and the topic can be deleted with the --delete option.

5.5 Modifying a Topic

To modify a topic, the following command can be used: kafka-topics.sh --alter --zookeeper localhost:2181 --replication-factor 4 --topic test. This modifies the replication factor for the topic ‘test’ to 4. Remember that we just created this topic with a replication factor of 3 in the previous screen. The --alter option is used after a topic has been created. This connects to the local ZooKeeper instance at port 2181.

5.6 kafka-topics.sh Options

The table displays the various options of kafka-topics.sh:
• --help: displays the help message of command usage.
• --create: creates a new topic.
• --delete: deletes an existing topic.
• --list: displays all the available topics.
• --describe: displays information about a topic.
• --partitions: indicates the number of partitions of a topic; it controls the parallelism of the topic.
• --replication-factor: specifies the number of copies of a message to be retained to achieve fault tolerance.
• --topic: specifies a topic.
• --zookeeper: specifies the list of ZooKeeper URLs to connect to.

5.7 Creating a Message

Kafka provides the command “kafka-console-producer.sh” to create messages. The command without parameters provides the usage of the command. Various options are available to control the creation of messages.

5.8 kafka-console-producer.sh Options

The table displays the various options of the command “kafka-console-producer.sh”:
• --broker-list: specifies a list of Kafka brokers to connect to, as a comma-separated list. Each broker is specified as a host name or IP address and a port number, separated by the : character. For example, “broker1:9092,broker2:9093” or “192.168.10.1:9092,192.168.10.2:9093”.
• --topic: specifies a topic for the messages. The partition is chosen at random.
There are two modes for sending messages: synchronous and asynchronous. Asynchronous is the default; in this mode, messages are batched together and sent in batches of 200 messages each. In synchronous mode, messages are sent as soon as they are created; this mode is specified with the --sync option. The batch size can be overridden with the --batch-size option. In asynchronous mode, a timeout in milliseconds can be specified with the --timeout option; once the timeout has elapsed, the buffered messages are sent without waiting for the batch to fill. Note that, unlike other Kafka commands, kafka-console-producer.sh takes a list of brokers instead of ZooKeeper URLs. The messages are taken from the console: each line is treated as a message and is sent to the Kafka cluster.

5.9 Creating a Message—Example 1

To create a message, type:
kafka-console-producer.sh --broker-list localhost:9092 --topic test
Next, type the messages on the screen as below:
This is first message
This is second message
This is third message
Press Ctrl+D. Note that Ctrl+D means pressing the Control key and the letter D together; this represents the end-of-file character in Linux and terminates the program. Each line above is treated as a separate message, so three messages are added to the topic ‘test’. The command connects to the Kafka broker at port 9092. The messages are added to a partition chosen at random; this command does not have an option to choose the partition for the topic. You can also create messages from a file by redirecting standard input, as shown in the next example.

5.9 Creating a Message—Example 1

To create a message, type kafka-console-producer.sh --broker-list localhost:9092 --topic test Next, type the messages on the screen as below: This is first message This is second message This is third message Press Ctrl+D. Note that Control+D means pressing the Control key and the letter D together. This represents the end of a file character in Linux and terminates the program. Each line above is treated as a separate message. So three messages are added to the topic test. It connects to the Kafka broker at port 9092. The messages are added to a partition chosen at random. This command does not have an option to choose the partition for the topic. You can also create messages from a file by using the ‘

5.10 Creating a Message—Example 2

To create messages from a file, type:
kafka-console-producer.sh --broker-list localhost:9092 --topic test < message.txt
where the file message.txt contains the following data:
IBM,100
DEL,200
ABC,120
XYZ,340
AXL,212
This is useful to create a large number of messages at once.

5.10 Creating a Message—Example 2

To create messages from a file, type kafka-console-producer.sh --broker-list localhost:9092 --topic test < message.txt, where the file message.txt contains the following data: IBM,100 DEL,200 ABC,120 XYZ,340 AXL,212 This is useful to create a large number of messages at once.

5.11 Reading a Message

Kafka provides the command “kafka-console-consumer.sh” to read messages. The command without parameters provides the usage of the command. Various options are available to control the reading of messages. The messages read are displayed on the screen. The command continues to read the messages until it is terminated.

5.12 kafka-console-consumer.sh Options

The table displays the various options of the command “kafka-console-consumer.sh”:
• --zookeeper: specifies the list of ZooKeeper servers to connect to, as a comma-separated list. Each ZooKeeper server is specified as a host name or IP address and a port number, separated by the : character. For example, “host1:2181,host2:2181”. Here, 2181 is the default port for ZooKeeper.
• --topic: specifies a topic for the messages.
• --from-beginning: reads all the messages from the topic, irrespective of when each message was created. Without this option, only the messages created after the command starts are read.
• --whitelist: specifies a list of topics to include; --blacklist excludes a list of topics. One of --topic, --whitelist, or --blacklist has to be specified.
• --consumer.config: provides a properties file. A sample properties file is available at /usr/local/kafka/config/consumer.properties. Properties such as the consumer group for the consumer and the partition assignment strategy can be specified in this file.

5.13 Reading a Message—Example

To read messages, type:
kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
This gives the following three lines as output:
This is first message
This is second message
This is third message
This reads the messages from the topic ‘test’ by connecting to the Kafka cluster through the ZooKeeper at port 2181. Since --from-beginning is specified, all the messages in the topic are read and displayed on the screen. The command continues to read messages until it is terminated; press Ctrl+C to terminate it.

5.13 Reading a Message—Example

To read a message, type kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning This gives following three lines as output: This is first message This is second message This is third message This reads the messages from the topic ‘test’ by connecting to the Kafka cluster through the ZooKeeper at port 2181. Since from-beginning is specified, all the messages from the topic are read and displayed on the screen. The command continues to read the messages until it is terminated; however, you can terminate it by pressing Ctrl+C.

5.14 Java Interface to Kafka

Kafka provides two types of Java APIs for interfacing with the Kafka cluster: producer side and consumer side APIs. The producer side APIs provide the interface to put messages into Kafka, whereas the consumer side APIs provide the interface to read messages from Kafka.

5.15 Producer Side API

The producer side APIs provide the interface to connect to the cluster and insert messages into a topic. The steps involved in programming are as follows:
1. Set up a producer configuration
2. Get a handle to the producer connection
3. Create messages as key and value pairs
4. Submit the messages to a particular topic
5. Close the connection
By default, a message is submitted to a particular partition of the topic based on the hash value of the key. A programmer can override this with a custom partitioner. Detailed Java documentation is available on the Apache Kafka site.
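The five steps above can be sketched as a single program. This is a minimal sketch against the old 0.8-era producer API (kafka.javaapi.producer) that this lesson describes, and it assumes the Kafka client jars are on the classpath and a broker is running at localhost:9092; the key and message text are illustrative.

```java
import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class TestProducer {
    public static void main(String[] args) {
        // Step 1: set up the producer configuration
        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092");             // broker list
        props.put("serializer.class", "kafka.serializer.StringEncoder"); // default string encoder
        ProducerConfig config = new ProducerConfig(props);

        // Step 2: get a handle to the producer connection
        Producer<String, String> producer = new Producer<String, String>(config);

        // Step 3: create a message as a key and value pair
        String key = "key1";
        String value = "This is first message";

        // Step 4: submit the message to the topic 'test'
        KeyedMessage<String, String> message =
                new KeyedMessage<String, String>("test", key, value);
        producer.send(message);

        // Step 5: close the connection
        producer.close();
    }
}
```

To send many messages, steps 3 and 4 can be repeated in a loop before the handle is closed.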

5.16 Producer Side API Example—Step 1

The screen illustrates the code fragment of step 1 for the producer side API. In step 1, set up the producer configuration. The program fragment illustrates getting a new Properties object and setting the broker list and serializer properties. The Properties object is used to create a new producer configuration object. The serializer is used to encode the message data so that it can be decoded by the consumer. The serializer used here is the default string encoder provided by Kafka.

5.16 Producer Side API Example—Step 1

The screen illustrates the code fragment of step 1 for the producer side API. In step 1, setup the producer configuration. The program fragment illustrates getting a new property object, and setting the properties for broker list and serializer properties. The property object is used to get a new producer configuration object. Serializer is used to encode the message data for read by the consumer. The serializer used here is the default string encoder provided by Kafka.

5.17 Producer Side API Example—Step 2

In step 2, get a handle to the producer connection. The program fragment illustrates getting a new producer object using the configuration we just created. This producer object represents the handle to the producer for the Kafka cluster.

5.17 Producer Side API Example—Step 2

In step 2, get a handle to the producer connection. The program fragment illustrates getting a new producer object using the configuration we just created. This producer object represents the handle to the producer for the Kafka cluster.

5.18 Producer Side API Example—Step 3

In step 3, create the messages to be sent. The program fragment illustrates creating two messages, each as a key and corresponding value. You can create the keys and values as arrays as well, so that a large number of messages can be sent to Kafka. You can also read from a file and send the messages.

5.18 Producer Side API Example—Step 3

In step 3, create the messages to be sent. The program fragment illustrates creating two messages, each as a key and corresponding value. You can create the keys and values as arrays as well, so that a large number of messages can be sent to Kafka. You can also read from a file and send the messages.

5.19 Producer Side API Example—Step 4

In step 4, submit the messages to a particular topic. The program fragment illustrates creating a keyed message object with the key and the messages created in the previous step, and then sending the messages through the producer handle. If you have a large number of messages, these statements can be placed in a loop to send the messages.

5.19 Producer Side API Example—Step 4

In step 4, submit the messages to a particular topic. The program fragment illustrates creating a keyed message object with the key and the messages created in the previous step and then sending the messages through the producer handle. If you have a large amount of messages, these statements can be made in a loop to send the messages.

5.20 Producer Side API Example—Step 5

In step 5, close the connection. The program fragment illustrates closing the connection using the producer handle. The full program TestProducer.java is available for download from the Simplilearn site. The compilation steps are explained in a later screen.

5.20 Producer Side API Example—Step 5

In step 5, close the connection. The program fragment illustrates closing the connection using the producer handle. Full program TestProducer.java is available for download from the Simplilearn site. The compilation steps are explained in a later screen.

5.21 Consumer Side API

The consumer side APIs provide the interface to connect to the cluster and get messages from a topic. The steps involved in programming are:
1. Set up a consumer configuration
2. Get a handle to the consumer connection
3. Get a stream of messages for a topic
4. Loop over the messages and process them
5. Close the connection
Messages can be read from a particular partition or from all the partitions. Messages are read in the order that they are produced.
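The five steps above can be sketched as a single program. This is a minimal sketch against the old high level consumer API (kafka.javaapi.consumer) that this lesson describes, and it assumes the Kafka client jars are on the classpath and ZooKeeper is running at localhost:2181; the group ID "test-group" is an assumed name.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class TestConsumer {
    public static void main(String[] args) {
        // Step 1: set up the consumer configuration
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // ZooKeeper connection
        props.put("group.id", "test-group");              // consumer group ID
        ConsumerConfig config = new ConsumerConfig(props);

        // Step 2: get a handle to the consumer connection
        ConsumerConnector consumer = Consumer.createJavaConsumerConnector(config);

        // Step 3: get a stream of messages for the topic 'test'
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put("test", 1); // one stream for this topic
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                consumer.createMessageStreams(topicCountMap);

        // Step 4: loop over the messages and process them
        for (KafkaStream<byte[], byte[]> stream : streams.get("test")) {
            ConsumerIterator<byte[], byte[]> it = stream.iterator();
            while (it.hasNext()) {
                // hasNext blocks until a new message arrives
                System.out.println(new String(it.next().message()));
            }
        }

        // Step 5: close the connection (may never be reached, as noted later)
        consumer.shutdown();
    }
}
```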

5.22 Consumer Side API Example—Step 1

The screen illustrates the program fragment of step 1 for the consumer side API. In step 1, set up the consumer configuration. The program fragment illustrates getting a new Properties object and setting the properties for the ZooKeeper connection and the consumer group ID. The Properties object is used to create a new consumer configuration object.

5.22 Consumer Side API Example—Step 1

The screen illustrates the program fragment of step 1 for the Consumer side API. In step 1, setup the consumer configuration. The program fragment illustrates getting a new property object, setting the properties for the ZooKeeper connection and consumer group ID. The property object is used to get a new consumer configuration object.

5.23 Consumer Side API Example—Step 2

In step 2, get a handle to the consumer connection. The program fragment illustrates getting a new consumer object using the configuration we just created. This consumer object represents the handle to the consumer for the Kafka cluster.

5.23 Consumer Side API Example—Step 2

In step 2, get a handle to the consumer connection. The program fragment illustrates getting a new consumer object using the configuration we just created. This consumer object represents the handle to the consumer for the Kafka cluster.

5.24 Consumer Side API Example—Step 3

In step 3, we get a stream of messages for the topic. The program fragment illustrates creating a hash map with the topic as key and number 1 as value and then creating a stream with this hash map for the topic. The stream is created as a list object. A hash map is a collection of key value pairs.
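As a small illustration of that structure, the fragment below pairs a topic name with the number of streams to create for it, using only java.util classes; the topic name ‘test’ follows the lesson's example.

```java
import java.util.HashMap;
import java.util.Map;

public class TopicCountDemo {
    public static void main(String[] args) {
        // Key: topic name; value: number of message streams to open for that topic
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put("test", 1);
        System.out.println(topicCountMap.get("test")); // prints 1
    }
}
```

This is the map that is then passed to the consumer handle to obtain the message streams.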

5.24 Consumer Side API Example—Step 3

In step 3, we get a stream of messages for the topic. The program fragment illustrates creating a hash map with the topic as key and number 1 as value and then creating a stream with this hash map for the topic. The stream is created as a list object. A hash map is a collection of key value pairs.

5.25 Consumer Side API Example—Step 4

In step 4, we loop over the messages and process them. The program fragment illustrates looping over the message streams obtained in the previous step and processing each message in the stream. The hasNext method checks if there is another message, and waits for a new message if none is waiting to be read. Each message read from Kafka is printed on the screen using the println method.

5.25 Consumer Side API Example—Step 4

In Step 4, we loop over messages and process the messages. The program fragment illustrates looping over the message streams we got in the previous step and processing each message in the stream. The hasNext checks if there is another message and waits for a new message if no message is waiting to be read. Each message read from Kafka is printed on the screen using the println command.

5.26 Consumer Side API Example—Step 5

In Step 5, close the connection. The program fragment illustrates shutting down the connection using the consumer handle. It is possible that the program may never reach this point as it is always waiting for a message. In that case we have to forcefully terminate the program with Ctrl+C. The full program TestConsumer.java is available for download from the Simplilearn site.

5.26 Consumer Side API Example—Step 5

In Step 5, close the connection. The program fragment illustrates shutting down the connection using the consumer handle. It is possible that the program may never reach this point as it is always waiting for a message. In that case we have to forcefully terminate the program with Ctrl+C. The full program TestConsumer.java is available for download from the Simplilearn site.

5.27 Compiling a Java Program

Once the Java program is created, we can compile it and create a jar file on a virtual machine. The commands on the screen illustrate how to compile and create the jar file:
1. Set up CLASSPATH to point to the Kafka libraries required for compilation
2. Create a directory for the classes
3. Compile the Java program
4. Create a jar file
5. Check the jar file
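The five steps can be sketched as shell commands. This is a minimal sketch, not the exact commands from the slide: the Kafka path matches the lesson's /usr/local/kafka installation, but the class directory and jar name are assumptions, and the jar names under libs vary by Kafka version.

```shell
# 1. Point CLASSPATH at the Kafka libraries required for compilation
export CLASSPATH="/usr/local/kafka/libs/*"

# 2. Create a directory for the compiled classes
mkdir -p classes

# 3. Compile the Java program into that directory
javac -cp "$CLASSPATH" -d classes TestProducer.java

# 4. Package the classes into a jar file
jar cf testproducer.jar -C classes .

# 5. Check the contents of the jar file
jar tf testproducer.jar
```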

5.28 Running the Java Program

Use the command given below to run the Java program created in the previous screen. Note that the full path name of the class is specified. Arguments represent the command line options required for the program. For example, TestConsumer does not take any arguments.
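A hedged example of such a run command, assuming a jar named myprogram.jar containing the class; as noted above, TestConsumer takes no arguments.

```shell
# Run the program with the Kafka libraries and our jar on the classpath;
# the full class name is specified on the command line
java -cp "/usr/local/kafka/libs/*:myprogram.jar" TestConsumer
```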

5.29 Java Interface Observations

The observations that can be made from the Java interface are:
• The Java interface depends on the version of Kafka, so you need to check the documentation for your version of Kafka.
• The consumer API we used is called the high level consumer API. Lower level APIs are also provided, but they are more difficult to use. The high level consumer API hides details such as selection of a leader.
• The Java interface supports threading, so you can have multiple threads to create and read messages.
Please go through the lab exercises and perform the tasks given in the following screens.

5.42 Quiz

A few questions will be presented in the following screens. Select the correct option and click submit to see the feedback.

5.43 Summary

Here is a quick recap of what we have learned in this lesson:
• Kafka provides a command line interface to create and alter topics.
• Kafka provides a command line interface to read messages.
• The producer side APIs add messages to the cluster for a topic.
• The consumer side APIs get messages for a topic as a stream of messages.
• Once a Java program is created, it can be compiled and packaged into a jar file on a virtual machine.

5.44 Thank You

This concludes the lesson ‘Kafka Interfaces.’ With this, we have come to the end of this course. Thank you and happy learning!
