Introduction to Zookeeper Tutorial

Welcome to the second chapter of the Apache Kafka tutorial (part of the Apache Kafka course). This lesson provides an introduction to Apache Zookeeper.

In the next section of this Apache Kafka tutorial, we will discuss the objectives of this lesson.


After completing this lesson, you will be able to:

  • Describe what Apache Zookeeper is and how it functions.

  • Explain some of the common problems of distributed systems.

  • Illustrate the data model for Apache Zookeeper.

  • Compare the two types of znodes.

Furthermore, this lesson will help you to discuss a few Apache Zookeeper recipes and the way they handle some of the problems of distributed systems.

In the next section of this Apache Zookeeper tutorial, we will introduce Apache Zookeeper.

Apache Zookeeper - Introduction

Apache Zookeeper is a coordination service of Apache that helps manage the activities of distributed applications.

It is highly scalable. It also has an open source library of recipes for distributed systems, such as leader election, exclusive locks, bookkeeper, and so on. These recipes facilitate building relations between distributed processes and applications.

One of the main features of Apache Zookeeper is that it helps handle partial failures in distributed systems.


Distributed Applications in Apache Zookeeper

Before you learn about Apache Zookeeper, let us understand what distributed applications are and what sort of problems arise while using them.

Distributed applications are run on multiple machines in parallel. They function by following the divide and conquer principle. This means that they divide large jobs into smaller jobs, which are then run in parallel on multiple machines. They are horizontally scalable if adding more machines reduces the execution time.

For example, if 10 machines do a job in 10 hours, adding 10 more machines may halve the execution time to 5 hours.

They are vertically scalable if increasing the memory, CPU, or other resources of each machine reduces the execution time. For example, increasing the memory from 100 GB to 256 GB may reduce the execution time of a job from 10 hours to 5 hours.
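The horizontal-scaling arithmetic above can be sketched as follows. This is an idealized model that assumes the job splits evenly across machines with no coordination overhead, which real systems rarely achieve. The function name `ideal_hours` is hypothetical.

```python
def ideal_hours(machine_hours: float, machines: int) -> float:
    """Execution time if the work splits evenly across machines (no overhead)."""
    if machines <= 0:
        raise ValueError("machines must be positive")
    return machine_hours / machines

# 10 machines taking 10 hours represent 100 machine-hours of work.
total_work = 10 * 10
print(ideal_hours(total_work, 10))  # 10.0
print(ideal_hours(total_work, 20))  # 5.0
```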

The diagram below shows three machines connected by a network switch.

An application can be distributed to run on them in parallel. These machines are also referred to as nodes in a cluster.

In the next section of this Apache Zookeeper tutorial, we will discuss the challenges involved in distributed applications.

Challenges of Distributed Applications

One of the major issues that crop up while using distributed systems is ‘partial failure’. Another common problem is a ‘race condition’. Deadlocks and inconsistent states are also some challenges of distributed systems.

Let us look at each of these difficulties in detail in the next few sections.

Partial Failures in Apache Zookeeper

Partial failure is a major challenge in distributed applications.

Suppose there are two nodes—Node 1 and Node 2—in a distributed system.

Node 1 sends a message to Node 2 through the network. However, the network fails before Node 1 receives an acknowledgment from Node 2.

As a result, Node 1 does not know if Node 2 got the message or not. It will know the actual status only after the network gets connected again. This is known as a partial failure in distributed applications.

Though partial failures cannot be prevented, tools like Apache Zookeeper provide a mechanism to handle them efficiently.
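The scenario above can be illustrated with a small simulation. It is only a sketch: the `Node2` class, the `send` function, and the `drop_message` and `drop_ack` flags are hypothetical names used to model a network failure, not part of Apache Zookeeper.

```python
from dataclasses import dataclass, field

@dataclass
class Node2:
    inbox: list = field(default_factory=list)

def send(node2: Node2, message: str, drop_message: bool, drop_ack: bool) -> str:
    """Simulate one send from Node 1 and return what Node 1 observes."""
    if drop_message:
        return "timeout"            # the message never arrived
    node2.inbox.append(message)     # Node 2 received the message
    if drop_ack:
        return "timeout"            # the message arrived, but the ack was lost
    return "ack"

a, b = Node2(), Node2()
seen_a = send(a, "hello", drop_message=True, drop_ack=False)
seen_b = send(b, "hello", drop_message=False, drop_ack=True)
print(seen_a == seen_b)   # True: Node 1 observes the same thing in both cases
print(a.inbox, b.inbox)   # [] ['hello']: yet the state of Node 2 differs
```

From the sender's point of view the two cases are indistinguishable, which is what makes the failure partial.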

Race Conditions in Apache Zookeeper

A race condition takes place in distributed applications when multiple machines are waiting for one resource to become free. Suppose there are four different nodes in a distributed system.

Let us assume that currently, only Node 1 is using the resource. So, it has an exclusive lock on the resource.

All the other nodes—from Node 2 to Node 4—are waiting for the resource to become available.

When Node 1 releases the resource, Nodes 2 to 4 race to acquire the resource. Only one of them succeeds, while the others go back to the waiting state. This process continues till all the nodes get the resource. This is called a race condition.
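As a rough single-machine analogy, the race can be sketched with threads competing for one lock. The threads stand in for Nodes 2 to 4 and the main program plays Node 1; this is an assumption made for illustration, since real nodes would be separate machines.

```python
import threading

resource = threading.Lock()      # the single shared resource
acquired_order = []              # records which node got the resource, in order

def node(name: str) -> None:
    with resource:               # blocks until the resource is free
        acquired_order.append(name)

resource.acquire()               # Node 1 currently holds the resource
waiters = [threading.Thread(target=node, args=(f"Node {i}",)) for i in (2, 3, 4)]
for t in waiters:
    t.start()
resource.release()               # Node 1 releases; the waiters race for it
for t in waiters:
    t.join()

print(sorted(acquired_order))    # every node eventually gets a turn
```

The order in which the waiting nodes win the lock is not predictable, which is the essence of the race.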

Deadlocks in Apache Zookeeper

Deadlocks occur when there is a cyclic dependency on resources.

The diagram below illustrates a deadlock situation.

There are two machines—Machine 1 and Machine 2; and there are two resources—resource A and resource B.

Machine 1 has locked resource A and is waiting to lock resource B. At the same time, machine 2 has locked resource B and is waiting to lock resource A.

Since neither machine can acquire the lock it needs and neither releases the lock it holds, a deadlock results. To resolve a deadlock, one of the processes has to be killed and its processing redone. Detecting deadlocks is generally a CPU-intensive and expensive operation.
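One common way to detect such a cyclic dependency is to look for a cycle in a wait-for graph. The sketch below uses that technique as an illustration; it is not something the tutorial or Apache Zookeeper prescribes. It assumes each machine waits on at most one other machine, and the function name `has_deadlock` is hypothetical.

```python
def has_deadlock(waits_for: dict) -> bool:
    """Return True if the wait-for graph (machine -> machine it waits on) has a cycle."""
    for start in waits_for:
        seen = set()
        current = start
        while current in waits_for:
            if current in seen:
                return True          # we came back to a machine already visited
            seen.add(current)
            current = waits_for[current]
    return False

# Machine 1 holds A and waits for B (held by Machine 2);
# Machine 2 holds B and waits for A (held by Machine 1).
print(has_deadlock({"Machine 1": "Machine 2", "Machine 2": "Machine 1"}))  # True
print(has_deadlock({"Machine 1": "Machine 2"}))                            # False
```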

Inconsistencies in Apache Zookeeper

Inconsistencies take place when changes are not propagated to all the machines in a distributed system.

For example, let us consider a salary value that is initially equal to 100. This value is replicated to two machines, Machine 1 and Machine 2. The value is later updated to 200. This change is first propagated to Machine 1. The value stored on it is now changed to 200.

However, due to some failure, this update is not propagated to Machine 2.

So, the value stored on it remains 100. In such a situation, if process A reads from Machine 1 while process B reads from Machine 2, the two processes will get different values of the salary. This leads to inconsistencies.
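The salary example can be reproduced with a toy model of two replicas. The `replicas` dictionary and the `update` function are hypothetical names for illustration only.

```python
replicas = {"Machine 1": {"salary": 100}, "Machine 2": {"salary": 100}}

def update(key, value, reachable):
    """Apply an update only to the replicas that the update actually reaches."""
    for name in reachable:
        replicas[name][key] = value

update("salary", 200, reachable=["Machine 1"])   # propagation to Machine 2 fails

process_a = replicas["Machine 1"]["salary"]      # process A reads from Machine 1
process_b = replicas["Machine 2"]["salary"]      # process B reads from Machine 2
print(process_a, process_b)                      # 200 100
```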

In the next section of this Apache Zookeeper tutorial, we will discuss Apache Zookeeper Characteristics.

Apache Zookeeper Characteristics

Apache Zookeeper helps coordinate distributed applications. It provides a very simple interface. It is expressive, which means that it provides basic blocks that can be used to build larger applications. It is also highly available and reliable. This is because it runs on multiple servers at the same time.

So, even if a few servers fail, it continues to function. To tolerate the failure of ‘n’ machines, it is recommended to run the Apache Zookeeper service on 2n + 1 machines, because a majority of the servers must remain available.

For example, if you want to have a fault tolerance of 3 machines, then you should have Apache Zookeeper running on 7 machines.
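The 2n + 1 rule can be checked with a few lines of arithmetic. The function names below are hypothetical, and the sketch assumes the service needs a strict majority of servers to stay available.

```python
def servers_needed(failures: int) -> int:
    """Servers required to tolerate the given number of failures."""
    return 2 * failures + 1

def failures_tolerated(servers: int) -> int:
    """Failures an ensemble survives while a majority is still running."""
    return (servers - 1) // 2

print(servers_needed(3))       # 7
print(failures_tolerated(7))   # 3
print(failures_tolerated(8))   # 3
```

Note that an eighth server does not raise the tolerance beyond that of seven, which is why odd ensemble sizes are usually chosen.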

Another feature of Apache Zookeeper is that it has loosely coupled interactions. Machines using it do not have to know each other. Apache Zookeeper is actually an extensive library of recipes for distributed coordination.

In the next section of this Apache Zookeeper tutorial, we will discuss Apache Zookeeper Data Model.

Apache Zookeeper Data Model

The Apache Zookeeper data model consists of a hierarchical tree of nodes called znodes. The tree of nodes is similar to a directory structure in Linux. Each znode stores a small amount of data, and has an associated Access Control List or ACL.

The ACL specifies which users can read, write, or update the znode. A znode can store at most 1 MB of data.

The diagram here shows a tree structure of znodes.

There are two znodes ‘/kafka’ and ‘/hbase’ at the root level. At the second level, there are two more znodes ‘/kafka/node0000’ and ‘/kafka/node0001’.
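The tree described above can be modeled in memory to make the structure concrete. This is a toy sketch, not the Apache Zookeeper API: the `ZnodeTree` class and its methods are hypothetical, and ACLs are left out.

```python
MAX_DATA_BYTES = 1024 * 1024            # the 1 MB limit mentioned above

class ZnodeTree:
    def __init__(self):
        self.data = {"/": b""}          # full path -> data stored at that znode

    def create(self, path: str, data: bytes = b"") -> None:
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.data:
            raise KeyError(f"parent {parent} does not exist")
        if path in self.data:
            raise KeyError(f"{path} already exists")
        if len(data) > MAX_DATA_BYTES:
            raise ValueError("znode data exceeds 1 MB")
        self.data[path] = data

    def children(self, path: str) -> list:
        prefix = path.rstrip("/") + "/"
        names = (p[len(prefix):] for p in self.data
                 if p.startswith(prefix) and p != path)
        return sorted(n for n in names if "/" not in n)   # direct children only

tree = ZnodeTree()
for p in ("/kafka", "/hbase", "/kafka/node0000", "/kafka/node0001"):
    tree.create(p, b"200")
print(tree.children("/"))        # ['hbase', 'kafka']
print(tree.children("/kafka"))   # ['node0000', 'node0001']
```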

In the next few sections of this Apache Zookeeper tutorial, we’ll discuss types of Znodes in Apache Zookeeper.

Types of Znodes in Apache Zookeeper

Apache Zookeeper znodes can be of two types: persistent znodes and ephemeral znodes.

Note that when a znode is created, its type is specified and it cannot be changed later.

Persistent znodes

Persistent znodes are permanent and have to be deleted explicitly by the client. They stay even after the session that created the znode is terminated.

Ephemeral znodes

Ephemeral znodes are temporary. These znodes are deleted automatically when the client session creating them ends. Ephemeral znodes are used to detect the termination of a client.

An alert, known as a watch, can be set up to detect the deletion of the znode.
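The difference between the two types, and the role of a watch, can be sketched with a small in-memory model. The `MiniZk` class is hypothetical and only imitates the behavior described above; it is not the real client API.

```python
class MiniZk:
    """Toy in-memory model of znodes, sessions, and deletion watches."""
    def __init__(self):
        self.owner = {}      # path -> owning session (ephemeral) or None (persistent)
        self.watches = {}    # path -> callbacks to fire once when the path is deleted

    def create(self, path, session=None, ephemeral=False):
        self.owner[path] = session if ephemeral else None

    def watch_delete(self, path, callback):
        self.watches.setdefault(path, []).append(callback)

    def delete(self, path):
        del self.owner[path]
        for callback in self.watches.pop(path, []):   # each watch fires only once
            callback(path)

    def close_session(self, session):
        # Ephemeral znodes owned by the session are removed automatically.
        for path in [p for p, s in self.owner.items() if s == session]:
            self.delete(path)

zk = MiniZk()
events = []
zk.create("/config")                                   # persistent znode
zk.create("/worker-1", session="s1", ephemeral=True)   # ephemeral znode
zk.watch_delete("/worker-1", events.append)
zk.close_session("s1")                                 # the client terminates
print(sorted(zk.owner))   # ['/config']
print(events)             # ['/worker-1']
```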

Sequential Znodes in Apache Zookeeper

Znodes can be sequential. To do this, you can set a sequence flag while creating a znode. When this flag is set, an increasing counter value is appended to the name of the znode. The counter is maintained by the parent znode.

Sequential znodes are used to specify the ordering of the znodes. In the diagram, you can see that under the ‘/kafka’ parent, three sequential znodes (node0000, node0001, and node0002) are created.
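The naming scheme can be sketched as a counter kept by the parent. The `Parent` class is hypothetical. The four-digit width matches the names used in this lesson; an actual Apache Zookeeper server pads the counter to ten digits.

```python
class Parent:
    """Toy parent znode that hands out sequential child names."""
    def __init__(self, path: str, width: int = 4):
        self.path = path
        self.counter = 0          # maintained by the parent, never reused
        self.width = width

    def create_sequential(self, prefix: str) -> str:
        name = f"{self.path}/{prefix}{self.counter:0{self.width}d}"
        self.counter += 1
        return name

kafka = Parent("/kafka")
names = [kafka.create_sequential("node") for _ in range(3)]
print(names)   # ['/kafka/node0000', '/kafka/node0001', '/kafka/node0002']
```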


In a later lesson, you will learn how to install Apache Zookeeper and Kafka on an Ubuntu Linux system. However, if you need to work on an operating system other than Linux, you can access the software provided by VMware.

This software allows running one operating system on another using a virtual machine. This is facilitated by VMware Player. For non-commercial use, VMware Player can be downloaded and used free of cost from the VMware website.

Simplilearn Virtual Machine

Simplilearn has created a virtual machine on VMware Player.

This machine, known as Hadoop Pseudo Server, comes with a preinstalled Ubuntu 12.04 LTS operating system and Hadoop setup. It can be opened with VMware Player and can be used for installing Kafka.

Hadoop Pseudo Server can be downloaded from the given link.


PuTTY is a popular, free tool for connecting to Linux systems from Windows through a remote terminal. It overcomes some of the limitations of the VM.

For example, it allows moving the mouse pointer with ease, scrolling in the window, and copying and pasting text. PuTTY can be downloaded from the given link.


WinSCP is a popular tool for copying files between Windows and Linux. It stands for Windows Secure Copy. It can be used to copy files from the local Windows machine to the Ubuntu VM running in VMware Player. WinSCP can be downloaded from the given link.

Apache Zookeeper Installation

To install Apache Zookeeper on the Simplilearn VM, first update the installation libraries. Then, use the apt-get installer command to install Apache Zookeeper. When prompted to enter a password, type ‘simplilearn’, all in lowercase. Type ‘Y’ if asked for any confirmation.

Apache Zookeeper Configuration

To start the Apache Zookeeper server, you first have to configure it. To do that, set up the directory permissions for Apache Zookeeper. You can then start the Apache Zookeeper server using the command given here.

Note that the Apache Zookeeper server listens on the port 2181 by default.

Apache Zookeeper Command Line Interface

Apache Zookeeper installation comes with a command line interface. If you recall, 2181 is the default port for Apache Zookeeper. The command line interface gives the following command prompt. The command line interface can be used to check the znodes and create new znodes.

In the commands given here, the values 200, 201, and 202 are the data we want to associate with the znodes.

Apache Zookeeper Command Line Interface Commands

The table shows the commands that can be used in the Apache Zookeeper command line interface. The ‘create’ command can be used to create a znode. The option ‘-e’ is used to create an ephemeral znode.

However, if you do not specify this option, a persistent znode is created. The option ‘-s’ is used to specify a sequential znode. The ‘help’ command is used to get information about the available commands.

‘Path’ is the full path of the znode to be created and starts with a forward slash. The data to be associated with the znode is specified after the path. The ‘ls’ command is used to list the children of the znode at a given path, which also starts with a forward slash.

You can also add a watch on the path to be alerted about any changes to it. The ‘get’ and ‘set’ commands can be used to fetch or update the data of a znode. The ‘delete’ command can be used to delete the znode at the path.

Finally, you can use ‘quit’ to exit the command line interface.

Apache Zookeeper Client APIs

Like the command line interface, Apache Zookeeper also provides client APIs. Official bindings are available for Java and C, and community-maintained bindings exist for other languages.

Apache Zookeeper has APIs to create znodes, add watches, get and set data, and delete znodes.

In addition, it has APIs that can be used to get all the children of a znode and to check if a znode exists. There is also a sync API, which makes the server that a client is connected to catch up with the leader so that the client's subsequent reads return up-to-date data.

Apache Zookeeper Recipe 1: Handling Partial Failures

When a process is sending data to another process, it has to handle partial failures. Apache Zookeeper provides a recipe to tackle such a situation.

In the example given here, Process 2 is trying to send information to Process 1.

The receiver creates an ephemeral znode with the same name as the session id or process name. In this case, the znode is created with the name ‘/process1’. The sender keeps a watch on the receiver’s znode. If the data reaches the receiver process successfully, it informs the sender and then deletes the znode.

However, if Process 1 fails, the znode is automatically removed. The sender gets an alert about this change through the watch. It then takes an appropriate action, such as resending the message.

In this process, we use watches and the ephemeral character of znodes to handle partial failures.
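The sender's decision logic in this recipe can be sketched as follows. This is a simplified, single-process illustration: the `Sender` class and the `run` function are hypothetical, and the znode and its watch are modeled with a plain set and a callback rather than a real Apache Zookeeper client.

```python
class Sender:
    def __init__(self):
        self.confirmed = False
        self.actions = []

    def on_confirmation(self):
        self.confirmed = True            # the receiver reported successful delivery

    def on_znode_deleted(self, path):
        # Called by the watch on the receiver's znode.
        self.actions.append("done" if self.confirmed else "resend")

def run(receiver_fails: bool) -> list:
    sender = Sender()
    znodes = {"/process1"}               # ephemeral znode created by the receiver
    watch = sender.on_znode_deleted      # the sender watches /process1
    if not receiver_fails:
        sender.on_confirmation()         # receiver informs the sender first
    znodes.discard("/process1")          # explicit delete, or automatic on failure
    watch("/process1")                   # the deletion triggers the watch
    return sender.actions

print(run(receiver_fails=False))   # ['done']
print(run(receiver_fails=True))    # ['resend']
```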

Apache Zookeeper Recipe 2: Leader Election

Let us now look at another Apache Zookeeper recipe. Leader election uses ephemeral sequential nodes to create an automatic node order.

Let us see how the three processes—Process 1, Process 2, and Process 3—use the leader election mechanism.

Each process creates a sequential ephemeral znode under the parent ‘/kafka’ with the prefix ‘node’.

Thus Process 1 gets the znode ‘/kafka/node0000’, Process 2 gets the znode ‘/kafka/node0001’, and Process 3 gets the znode ‘/kafka/node0002’. The process whose znode has the lowest sequence number, Process 1, is chosen as the leader.

Process 2 and Process 3, in turn, become followers. Each follower watches the znode with the sequence number immediately preceding its own. So, Process 2 watches ‘/kafka/node0000’. Likewise, Process 3 watches ‘/kafka/node0001’.

To automatically reorder nodes, the sequential property of znodes is used. If Process 1 completes successfully, it deletes the znode ‘/kafka/node0000’. This alerts Process 2 as it has a watch on the leader znode.

Now Process 2 becomes the leader as it has the lowest sequence value. Process 3 remains a follower as the znode it watches is not modified. In this case, the watch feature serves to change the leader. Because each deletion alerts only one process, this approach helps you avoid a race condition.

Note that only Process 1 and Process 2 are involved in handling the leader change.

However, if Process 1 dies before completion, then the znode ‘/kafka/node0000’ is automatically deleted, as it is an ephemeral znode. This alerts Process 2 as it has a watch on the leader znode.

Now Process 2 becomes the leader as it has the lowest sequence value.

Process 3 remains a follower as the preceding znode is not modified. Leader election helps distributed processes to function by automatically handling a node failure.
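The election steps can be traced with a small in-memory model. The `Election` class is hypothetical and uses four-digit znode names as in the data model section; it imitates the recipe rather than calling a real Apache Zookeeper server.

```python
class Election:
    def __init__(self):
        self.znodes = {}     # znode path -> process that created it
        self.counter = 0     # sequence counter kept by the /kafka parent

    def join(self, process: str) -> str:
        path = f"/kafka/node{self.counter:04d}"
        self.counter += 1
        self.znodes[path] = process
        return path

    def leader(self) -> str:
        return self.znodes[min(self.znodes)]       # lowest sequence number wins

    def watched_by(self, path: str):
        """The znode that the owner of `path` watches: its immediate predecessor."""
        earlier = [z for z in self.znodes if z < path]
        return max(earlier) if earlier else None

    def remove(self, path: str) -> list:
        """Delete a znode and return the processes whose watch fires."""
        notified = [p for z, p in self.znodes.items() if self.watched_by(z) == path]
        del self.znodes[path]
        return notified

e = Election()
z1, z2, z3 = (e.join(p) for p in ("Process 1", "Process 2", "Process 3"))
print(e.leader())         # Process 1
print(e.watched_by(z3))   # /kafka/node0001
print(e.remove(z1))       # ['Process 2']
print(e.leader())         # Process 2
```

Only Process 2 is notified when the leader's znode disappears, whether it was deleted explicitly or removed because the session ended.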



Here is a quick recap of what we have learned in this lesson.

  • Apache Zookeeper is a distributed coordination service.

  • Partial failure is a major issue in distributed coordination.

  • Apache Zookeeper uses a hierarchical tree of znodes.

  • Apache Zookeeper has two types of znodes—persistent and ephemeral.

  • Apache Zookeeper provides recipes for handling common problems in distributed systems.

  • The sequential property is used to order znodes.

  • A watch can be set up to get alerts on changes to znodes.


This concludes the lesson on Introduction to Apache Zookeeper. The next lesson is Introduction to Kafka.
