The Complete Simplified Guide to Python Bokeh

Data visualization is an important aspect of machine learning. It allows you to explore your data and represent it in a form that other people can easily understand. While quite a few libraries are available for data visualization, the Python Bokeh library is among the easiest and most flexible to use.

Not only can you customize the graphs easily, but you can also create web layouts with them. In this tutorial on Python Bokeh, you will take a look at the various ways to plot different graphs with Bokeh. You will also see how to customize them and arrange them in a web layout.

What Is Bokeh?

Bokeh is a Python library that is used to make highly interactive graphs and visualizations. Bokeh renders these visualizations in the browser using HTML and JavaScript, which makes it a powerful tool for creating projects, custom charts, and web-based applications.

Figure 1: Python Bokeh

Scatter Charts

A scatter plot draws every data point in the dataset as an individual marker. You use it to plot two numeric variables against one another: each point's position on the x-axis is paired with its individually plotted value on the y-axis. As a result, the plot looks like a bunch of scattered points.

Now, see how scatter plots are plotted in Python Bokeh. Start by importing all the necessary modules.

Figure 2: Importing Bokeh

To create a scatter plot, draw a small circle at each pair of x and y coordinates.
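As a rough illustration of the import and plotting steps described above, here is a minimal sketch. The sample values, title, and axis labels are assumptions made for this example and are not taken from the original figures.

```python
from bokeh.plotting import figure, show

# Illustrative sample data (assumed for this sketch)
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# Create an empty figure, then draw one circular marker per (x, y) pair
p = figure(title="Simple scatter plot", x_axis_label="x", y_axis_label="y")
p.scatter(x, y, size=12, marker="circle", fill_color="navy", fill_alpha=0.5)

# show(p)  # uncomment to open the plot in a browser
```

Calling show(p) writes an HTML file and opens it in the browser, which is why it is left commented out here.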

Figure 3: Scatter plot

Line Chart

A line chart represents data as a series of points connected by a line. It is used to see trends in your data and track the way data changes over a period of time.

A line plot can be drawn with the line method of a figure created from Bokeh's plotting module. The bokeh.plotting module provides the figure object and the glyph methods used to draw graphs in Python Bokeh.
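A minimal sketch of a line plot could look like the following. The data, title, and labels are made up for illustration.

```python
from bokeh.plotting import figure, show

# Illustrative values over one week (assumed for this sketch)
days = [1, 2, 3, 4, 5, 6, 7]
values = [3, 5, 4, 6, 8, 7, 9]

# The line glyph connects consecutive points in the order given
p = figure(title="Simple line plot", x_axis_label="Day", y_axis_label="Value")
p.line(days, values, line_width=2, legend_label="Trend")

# show(p)  # uncomment to open the plot in a browser
```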

Figure 4: Line plot

To better understand how Python Bokeh works, use the Among Us dataset. This contains information about 2227 games played by 29 users. Among Us is a mobile game that 4 to 10 people can play. The game takes place on a spaceship, and 1 or 2 people are the imposters while the others are crewmates. The imposters have to kill the crewmates, and the crewmates have to figure out who the imposters are. The game ends when all the imposters have been outed, or the same number of crewmates and imposters remain.

This dataset is available on Kaggle. Start by importing the data. As each user's data is stored in a different file, you read the contents of each file into a pandas DataFrame.
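One way to combine per-user files is sketched below. The helper name read_user_files, the "User ID" column, and the sample rows are hypothetical. Two in-memory strings stand in for the real CSV files so that the sketch runs without the Kaggle download.

```python
import io

import pandas as pd


def read_user_files(sources):
    """Read one CSV per user and stack them into a single DataFrame."""
    frames = []
    for user_id, source in enumerate(sources, start=1):
        frame = pd.read_csv(source)
        frame["User ID"] = user_id  # remember which file each row came from
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


# Two tiny in-memory "files" stand in for the real per-user CSVs.
# On disk you would pass file paths instead (for example from glob.glob).
user1 = io.StringIO("Team,Outcome\nCrewmate,Win\nImposter,Loss\n")
user2 = io.StringIO("Team,Outcome\nCrewmate,Loss\n")

df = read_user_files([user1, user2])
print(df)
```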

Figure 5: Importing our dataset

The data looks as shown below:

Figure 6: Among us dataset

Now, use the describe function to see statistical information about your dataset.

Figure 7: Dataset Statistics

The ‘Game Length’ column tells you the duration of each game. The time contained in the column is in the form of minutes and seconds. Next, split the column and extract only the minutes from it. 
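The extraction step might be sketched as follows. The exact text format of the column is an assumption here (values such as "07m 04s"), as is the name of the new Minutes column. Adjust the pattern if the real data is formatted differently.

```python
import pandas as pd

# Assumed format: minutes followed by "m", seconds followed by "s"
df = pd.DataFrame({"Game Length": ["07m 04s", "16m 21s", "11m 33s"]})

# Capture the digits that come right before the "m" and convert them to integers
df["Minutes"] = (
    df["Game Length"].str.extract(r"(\d+)m", expand=False).astype(int)
)
print(df)
```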

Figure 8: Creating a new column

The Murdered column contains yes/no entries, along with some missing values. Now, change them to Murdered, Not Murdered, and Missing. Your final dataset is shown below.
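A possible way to relabel the column is sketched below. It assumes the raw values are "Yes" and "No" and that anything else (here a "-" placeholder) counts as missing. The sample rows are illustrative.

```python
import pandas as pd

# Assumed raw values: "Yes", "No", and "-" for games with no information
df = pd.DataFrame({"Murdered": ["Yes", "No", "-", "Yes"]})

# Values that are not in the mapping become NaN, which fillna turns into "Missing"
labels = {"Yes": "Murdered", "No": "Not Murdered"}
df["Murdered"] = df["Murdered"].map(labels).fillna("Missing")
print(df)
```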

Figure 9: Changing a column

Pie Charts

A pie chart is a circular chart divided into slices to represent how much data belongs to a specific category. It is a quick and easy way to see the classes in your data and the percentage of the dataset they represent. Donut charts are like pie charts with a hole in the center.

Now, use the Among Us data to plot a pie chart. A pie chart can be plotted by using plot_bokeh, the DataFrame method provided by the Pandas-Bokeh package. The kind parameter specifies the kind of graph to be plotted. In this case, set it to ‘pie’.

Figure 10: Pie Chart

From the above pie chart, you can infer that around 75% of the data falls in the crewmate category and 25% in the imposter category.

Another type of circular chart is a Donut Chart. A donut chart is a kind of pie chart with a circular space in the middle. The extra space can be used to represent data, or another graph can be added.

Now, sum the counts of different categories present in the Murdered column. You will then convert the counts of each type into angles. And finally, you will allocate a different color to each category. All of this information will be stored in a new dataset df_mur.
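The counting, angle, and color steps could be sketched like this. The sample values, the column names category, count, angle, and color, and the hex colors are assumptions made for illustration.

```python
import math

import pandas as pd

# Illustrative stand-in for the relabeled Murdered column
murdered = pd.Series(
    ["Murdered", "Not Murdered", "Missing", "Murdered",
     "Not Murdered", "Missing", "Missing", "Murdered"],
    name="Murdered",
)

# 1. Count how often each category occurs
df_mur = murdered.value_counts().rename_axis("category").reset_index(name="count")

# 2. Convert each count into its share of a full circle (2 * pi radians)
df_mur["angle"] = df_mur["count"] / df_mur["count"].sum() * 2 * math.pi

# 3. Give every category its own color
colors = {"Murdered": "#d62728", "Not Murdered": "#2ca02c", "Missing": "#7f7f7f"}
df_mur["color"] = df_mur["category"].map(colors)
print(df_mur)
```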

Figure 11: Creating a new dataset for donut plot 

You can plot the donut chart by using the annular_wedge function.
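A donut chart built from wedge angles might be sketched as follows. The counts, colors, radii, and title are illustrative assumptions. The cumsum transform turns the per-category angles into consecutive start and end angles.

```python
from math import pi

from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, show
from bokeh.transform import cumsum

# Illustrative counts (assumed for this sketch)
counts = {"Murdered": 40, "Not Murdered": 35, "Missing": 25}
total = sum(counts.values())

source = ColumnDataSource(data={
    "category": list(counts),
    "angle": [c / total * 2 * pi for c in counts.values()],
    "color": ["#d62728", "#2ca02c", "#7f7f7f"],
})

p = figure(title="Murdered crewmates", toolbar_location=None, x_range=(-0.6, 1.0))

# Each ring segment starts where the previous one ended
p.annular_wedge(
    x=0, y=0, inner_radius=0.2, outer_radius=0.4,
    start_angle=cumsum("angle", include_zero=True),
    end_angle=cumsum("angle"),
    line_color="white", fill_color="color",
    legend_field="category", source=source,
)

# A donut chart has no meaningful axes or grid
p.axis.visible = False
p.grid.grid_line_color = None

# show(p)  # uncomment to open the plot in a browser
```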

Figure 12: Creating a donut plot 

The graph below shows the donut plot obtained after running the above code. The counts of Murdered and Not Murdered are close to each other, and a significant number of values are missing. Because of this, you cannot determine whether the majority of the people were murdered.

Figure 13: Donut plot 

Histogram

Histograms are used to plot numerical data according to the range they fall into. The data is plotted in bins or rectangles. The y-axis corresponds to the amount of data present in a specific range or at a certain point.

Use the plot_bokeh function to plot a histogram of the minutes column. This column tells you the duration of each game. Here, you need to change the kind parameter to ‘hist’.
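For readers who want to see what a histogram involves, here is a sketch that uses plain Bokeh with NumPy instead of plot_bokeh. This is a swapped-in technique, not the call from the original figure, and the sample durations and bin count are made up.

```python
import numpy as np
from bokeh.plotting import figure, show

# Illustrative game durations in minutes (assumed for this sketch)
minutes = np.array([4, 6, 7, 8, 8, 9, 10, 11, 12, 12, 13, 14, 18, 22])

# Count how many games fall into each of 5 equal-width bins
hist, edges = np.histogram(minutes, bins=5)

# Draw one rectangle per bin: left/right are the bin edges, top is the count
p = figure(title="Game length", x_axis_label="Minutes", y_axis_label="Games")
p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
       fill_color="steelblue", line_color="white")

# show(p)  # uncomment to open the plot in a browser
```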

Figure 14: Histogram 

From the above graph, you can say that most games last for 6 - 14 minutes. 

To better plot categorical data, you can assign different colors to each category to understand how they compare. The histogram below shows the number of imposters and crewmates left at the end of each game. You are plotting the game length against the team column.

Figure 15: Stacked Histogram 

The above graph tells you that the longer a game goes on, the higher the chances of imposters getting caught. The games which go on beyond 4.5 minutes barely have any imposters left. 

Bar Plot

While histograms plot the distribution of numerical data, bar plots represent the distribution of categorical data. They also use bars to show the amount of data in each category. When the bars are stacked on top of each other, the result is called a stacked bar chart.

To plot a bar graph, set the kind of plot_bokeh to ‘bar’. The bar graph below shows you the number of people who have completed the tasks given to them.

Figure 16: Bar graph 

Now, go ahead and plot two bar graphs on top of each other. This type of bar chart is called a stacked bar graph. Here, plot the teams and outcomes against each other. A stacked bar graph can be plotted by setting the stacked parameter to True.
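The idea of stacking can also be sketched in plain Bokeh with vbar_stack, which is a swapped-in technique rather than the plot_bokeh call from the original figure. The counts and colors are illustrative.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, show

teams = ["Crewmate", "Imposter"]
outcomes = ["Win", "Loss"]

# One row per team, one column per outcome (illustrative counts)
source = ColumnDataSource(data={
    "team": teams,
    "Win": [30, 12],
    "Loss": [20, 8],
})

# A categorical x_range places one bar per team
p = figure(x_range=teams, title="Outcome by team", y_axis_label="Games")

# Each outcome becomes one layer; the layers are stacked in the order given
renderers = p.vbar_stack(
    outcomes, x="team", width=0.6,
    color=["#2ca02c", "#d62728"],
    legend_label=outcomes, source=source,
)

# show(p)  # uncomment to open the plot in a browser
```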

Figure 17: Stacked Bar graph 

You can also plot bars horizontally using the ‘barh’ method of plot_bokeh. Here, you need to plot the outcomes and tasks together. 

Figure 18: Horizontally Stacked Bar graph 

Another bar graph is the stacked vertical bar graph. Here, one half of the graph is negative, and the other half is positive. To plot this, multiply the loss column by -1.                  
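The sign flip can be sketched in pandas as follows. The column names and numbers are assumptions made for illustration.

```python
import pandas as pd

# Illustrative win/loss counts per user (assumed for this sketch)
results = pd.DataFrame({
    "User ID": [1, 2, 3],
    "Win": [40, 25, 31],
    "Loss": [22, 30, 12],
})

# Negative losses are drawn below the zero line, wins stay above it
results["Loss"] = results["Loss"] * -1
print(results)
```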

Figure 19: Creating negative columns 

The bar graph below shows the wins and losses of each user, identified by user ID.

Figure 20: Stacked vertical bar graph 

Area Plot

An area chart combines features of the line chart and the bar chart: the area below each line is shaded. Using an area chart, you can see how the value of different groups changes over a period of time. The area chart has different baselines to show the vertical range of data.

You can plot an area chart by using the varea_stack method. Here, you plot the number of sabotages fixed against the time at which they were fixed.
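A stacked area chart might be sketched like this. The column names, numbers, and colors are assumptions for illustration and do not come from the dataset.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, show

# Illustrative sabotages fixed per minute, split by team (assumed values)
source = ColumnDataSource(data={
    "minute": [1, 2, 3, 4, 5],
    "crewmate": [5, 4, 3, 2, 1],
    "imposter": [2, 2, 1, 1, 0],
})

p = figure(title="Sabotages fixed over time",
           x_axis_label="Minute", y_axis_label="Sabotages fixed")

# Each listed column becomes one shaded band stacked on the previous one
areas = p.varea_stack(
    ["crewmate", "imposter"], x="minute",
    color=["#1f77b4", "#ff7f0e"],
    legend_label=["Crewmate", "Imposter"], source=source,
)

# show(p)  # uncomment to open the plot in a browser
```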

Figure 21: Area Plot 

From the above figure, you can see that fewer sabotages are fixed as time increases. 

Layout Function

The layout function in Python Bokeh is used to arrange your various plots and widgets. This makes it possible to see multiple graphs at the same time. Used primarily for designing dashboards, it lets you build grids of plots.

You can create a layout by using the grid function from bokeh.layouts. You start by creating multiple graphs. Here, you first plot a lollipop graph of the top 10 users with the most wins. You then make a donut graph of the ratio of murdered crewmates. You also make a donut plot of the number of crewmates and imposters.

Figure 22: Creating multiple plots 

After plotting your graphs, you need to use the grid function to arrange them in a layout. The charts that are going to appear in the same row are placed in the same list. The lists are comma-separated, and their order determines which graphs appear at the top and bottom.
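The nested-list idea can be sketched as follows. Three placeholder plots stand in for the lollipop and donut graphs, and the helper name make_plot is hypothetical.

```python
from bokeh.layouts import grid
from bokeh.plotting import figure, show


def make_plot(title):
    """Build a small placeholder plot so the layout has something to arrange."""
    p = figure(title=title)
    p.line([1, 2, 3], [1, 4, 9])
    return p


wins = make_plot("Top users by wins")
murdered = make_plot("Murdered crewmates")
teams = make_plot("Crewmates vs. imposters")

# Each inner list is one row: two plots on top, one plot below
layout = grid([
    [wins, murdered],
    [teams],
])

# show(layout)  # uncomment to open the layout in a browser
```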

Figure 23: Bokeh Layout 

Interactivity With Bokeh

Now, use Python Bokeh to design a dashboard to represent the horsepower in cars. You will also make your graph interactive and see how maximum information can be conveyed with a single graph.

Start by importing all necessary modules into your program. 

Figure 24: Importing necessary modules

Then, you need to read your data in the form of a DataFrame. Use the ColumnDataSource class to convert the data into a format accepted by Python Bokeh. A ColumnDataSource is used to provide data to glyphs in Bokeh.
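Wrapping a DataFrame in a ColumnDataSource might look like the sketch below. The car names, numbers, and column names are placeholders invented for this example, not the tutorial's data.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# Placeholder rows (assumed for this sketch)
cars = pd.DataFrame({
    "name": ["Car A", "Car B", "Car C"],
    "horsepower": [300, 450, 600],
    "price": [40000, 85000, 150000],
})

# Every DataFrame column becomes a named column that glyphs can refer to
source = ColumnDataSource(cars)
print(source.column_names)
```

Glyph methods can then refer to columns by name, for example right="horsepower", instead of receiving raw lists.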

Figure 25: Reading in data 

The figure below shows the cars DataFrame. The data consists of the car name, horsepower, and price of each vehicle. There is also a column that holds the link to the car image.

Figure 26: Cars Dataset 

Now, make a horizontal bar graph of the above data. You start by creating a plot of width 800 and height 600. The title you give to the plot is ‘Cars with Top Horsepower’, and the x-axis has the label ‘Horsepower’. You also need to specify the tools that will be used.

Then add a horizontal bar graph to the graph and add a color palette.

Figure 27: Creating a horizontal bar graph 

Finally, add a hover box and customize it with HTML so that it displays the car price, the horsepower, and the image loaded from the link. Then display the graph.
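The bar graph and hover steps could be sketched together as follows. The car rows, image URLs, tool list, and palette are placeholders chosen for this example. The width and height arguments assume a recent Bokeh release (older 2.x versions used plot_width and plot_height).

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.palettes import Blues9
from bokeh.plotting import figure, show
from bokeh.transform import linear_cmap

# Placeholder data, sorted by horsepower (assumed for this sketch)
data = {
    "name": ["Car A", "Car B", "Car C"],
    "horsepower": [300, 450, 600],
    "price": ["$40,000", "$85,000", "$150,000"],
    "image": [
        "https://example.com/car_a.png",
        "https://example.com/car_b.png",
        "https://example.com/car_c.png",
    ],
}
source = ColumnDataSource(data=data)

p = figure(
    y_range=data["name"], width=800, height=600,
    title="Cars with Top Horsepower", x_axis_label="Horsepower",
    tools="pan,box_select,zoom_in,zoom_out,save,reset",
)

# Blues9 runs from dark to light, so reverse it: more horsepower, darker bar
mapper = linear_cmap("horsepower", palette=Blues9[::-1], low=300, high=600)
p.hbar(y="name", right="horsepower", left=0, height=0.6,
       fill_color=mapper, line_color=None,
       legend_field="name", source=source)

# @column placeholders are filled in from the row under the cursor
hover = HoverTool(tooltips="""
<div>
  <h3>@name</h3>
  <div><strong>Price: </strong>@price</div>
  <div><strong>HP: </strong>@horsepower</div>
  <div><img src="@image" alt="" width="200" /></div>
</div>
""")
p.add_tools(hover)

# show(p)  # uncomment to open the plot in a browser
```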

Figure 28: Creating a hover tool

The result is the graph shown below. The cars are arranged in ascending order of their horsepower. The higher a car's horsepower, the darker its bar. The legend for the graph is given on the top right and tells you the car each color is associated with.

Figure 29: Cars Graph

Conclusion 

In this Python Bokeh tutorial, you first looked into Bokeh and its different uses. You then took a look at how different types of graphs can be plotted in Bokeh and how a layout can be created. Finally, you used a cars dataset to create an interactive graph in Python Bokeh.

We hope this helped you understand how to make interactive plots with Python Bokeh. To learn more about deep learning and machine learning, check out Simplilearn's Artificial Intelligence course. On the other hand, if you need any clarifications on this Python Bokeh tutorial, share them with us by commenting below, and we will have our experts review them at the earliest!

Happy learning!

About the Author

Avijeet Biswal

Avijeet Biswal is an Assistant Manager at Simplilearn. He writes on leading technologies like Machine Learning, AI, and Data Analytics. His interest lies in generating actionable insights with data. Avijeet is fond of music and can spend an entire evening listening to the songs of an album he likes.
