Data visualization is an important part of machine learning. It lets you explore your data and present it in a form that other people can easily understand. While quite a few libraries are available for data visualization, the Python Bokeh library is among the easier and more flexible ones to use.
Not only can you customize the graphs easily, but you can also arrange them into web layouts. In this tutorial on Python Bokeh, you will look at the various ways to plot different graphs with Bokeh, see how they can be customized, and create a web layout too.
What Is Bokeh?
Bokeh is a Python library for making highly interactive graphs and visualizations. It renders them in the browser using HTML and JavaScript, which makes it a powerful tool for building projects, custom charts, and web-based applications.
Figure 1: Python Bokeh
How to Install Bokeh?
Installing Bokeh, a well-known Python interactive visualization library, involves the following steps:
1. Using pip:
- Launch a command prompt or terminal.
- Run the following command: pip install bokeh
2. Using Conda (If You Are Using Anaconda):
- Open a terminal or Anaconda prompt.
- Run the following command: conda install bokeh
3. Verifying the Installation:
After installation, you can verify that Bokeh is installed correctly by starting a Python session and importing Bokeh:
- Open a terminal or command prompt.
- Run Python: python
- Import Bokeh:
import bokeh
print(bokeh.__version__)
If there are no errors and the version number is displayed, Bokeh has been installed successfully. You are now ready to create interactive visualizations with Bokeh!
Scatter Charts
A scatter plot draws every data point in the dataset individually. You use it to plot two numeric variables against one another: each point is positioned by its value on the x-axis and its corresponding value on the y-axis. As a result, the plot looks like a bunch of scattered points.
Now, see how scatter plots are plotted in Python Bokeh. Start by importing the necessary modules.
Figure 2: Importing Bokeh
To create a scatter plot, draw a small circle at the x- and y-coordinates of each data point.
Figure 3: Scatter plot
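As a minimal sketch of the scatter plot described above, the snippet below uses a handful of made-up points. The data values, title, and styling are assumptions for illustration and are not taken from the tutorial's figures.

```python
from bokeh.plotting import figure

# Illustrative data only; not taken from the tutorial's dataset.
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

p = figure(title="Scatter plot sketch", x_axis_label="x", y_axis_label="y",
           width=500, height=350)

# scatter() draws one marker per (x, y) pair; circles are the default marker.
points = p.scatter(x, y, size=12, color="navy", alpha=0.6)

# To open the plot in a browser:
# from bokeh.plotting import show; show(p)
```

The size, color, and alpha arguments are ordinary glyph properties, so this is also where most per-plot customization happens.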
Line Chart
A line chart represents data as a series of points connected by a line. It is used to see trends in your data and track the way data changes over a period of time.
A line plot can be drawn with the line function of a figure created from Bokeh's plotting module, bokeh.plotting, which provides the interface for all the graphs plotted in this tutorial.
Figure 4: Line plot
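A line plot follows the same pattern as any other glyph: create a figure, then call the glyph method on it. The sketch below assumes a small made-up series; the values and labels are illustrative only.

```python
from bokeh.plotting import figure

# Illustrative series only: a value observed on five consecutive days.
days = [1, 2, 3, 4, 5]
values = [3, 5, 4, 8, 7]

p = figure(title="Line plot sketch", x_axis_label="Day", y_axis_label="Value",
           width=500, height=350)

# line() connects the points in the order they are given.
trend = p.line(days, values, line_width=2, color="green")

# To open the plot in a browser:
# from bokeh.plotting import show; show(p)
```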
To better understand how Python Bokeh works, use the Among Us dataset. It contains information about 2227 games played by 29 users. Among Us is a multiplayer game for 4-10 people. The game takes place on a spaceship, where 1-2 players are the imposters and the others are crewmates. The imposters have to kill the crewmates, and the crewmates have to figure out who the imposters are. The game ends when all the imposters have been outed, or when the same number of crewmates and imposters remain.
This dataset is available on Kaggle. Start by importing the data. As each user's data is stored in a separate file, you read the contents of each file into a pandas DataFrame.
Figure 5: Importing our dataset
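One way to read a set of per-user files into a single DataFrame is sketched below. The load_games helper, the file pattern, and the "User ID" column are hypothetical names introduced for illustration; adjust them to match the files you download.

```python
from glob import glob

import pandas as pd


def load_games(pattern):
    """Read every CSV file matching `pattern` into one DataFrame."""
    frames = []
    for user_id, path in enumerate(sorted(glob(pattern)), start=1):
        frame = pd.read_csv(path)
        # Hypothetical column: remember which file each game came from.
        frame["User ID"] = user_id
        frames.append(frame)
    # concat raises ValueError if no file matched the pattern.
    return pd.concat(frames, ignore_index=True)


# Hypothetical location of the downloaded files:
# games = load_games("data/User*.csv")
```

Tagging each row with the file it came from keeps per-user comparisons possible after the files are merged.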
The data looks as shown below:
Figure 6: Among us dataset
Now, use the describe function to see statistical information about your dataset.
Figure 7: Dataset Statistics
The "Game Length" column tells you the duration of each game, stored as minutes and seconds. Next, split the column and extract only the minutes from it.
Figure 8: Creating a new column
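The extraction step can be sketched with a regular expression. This assumes the durations are strings such as "07m 04s"; both that format and the "Minutes" column name are assumptions, so adjust the pattern if your data differs.

```python
import pandas as pd

# Assumed format: minutes and seconds written as "07m 04s".
games = pd.DataFrame({"Game Length": ["07m 04s", "16m 21s", "11m 33s"]})

# Capture the digits before "m" and store them as integers.
games["Minutes"] = (
    games["Game Length"].str.extract(r"(\d+)m", expand=False).astype(int)
)

print(games)
```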
The Murdered column contains yes/no entries, and some values are missing. Now, relabel them as Murdered, Not Murdered, and Missing. Your final dataset is shown below.
Figure 9: Changing a column
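A relabeling step like this can be sketched with a mapping. The raw values used here ("Yes", "No", and "-" for a missing entry) are assumptions about how the column is encoded.

```python
import pandas as pd

# Assumed raw values: "Yes", "No", and "-" where nothing was recorded.
games = pd.DataFrame({"Murdered": ["Yes", "No", "-", "Yes"]})

labels = {"Yes": "Murdered", "No": "Not Murdered"}

# Values that are not in the mapping become NaN, then "Missing".
games["Murdered"] = games["Murdered"].map(labels).fillna("Missing")

print(games["Murdered"].value_counts())
```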
Pie Charts
A pie chart is a circular chart divided into slices to represent how much data belongs to a specific category. It is a quick and easy way to see the classes in your data and the percentage of the dataset they represent. Donut charts are like pie charts with a hole in the center.
Now, use the Among Us data to plot a pie chart. A pie chart can be plotted with plot_bokeh, the plotting method that the Pandas-Bokeh package adds to pandas DataFrames. The kind attribute specifies the kind of graph to be plotted. In this case, set it to "pie".
Figure 10: Pie Chart
From the above pie chart, you can infer that around 75% of the data falls in the crewmate category and 25% in the imposter category.
Another type of circular chart is the donut chart: a pie chart with a circular space in the middle. The extra space can be used to display data, or another graph can be placed in it.
Now, count the entries in each category of the Murdered column. You then convert each count into an angle, in proportion to its share of the total, and assign a different color to each category. All of this information is stored in a new DataFrame, df_mur.
Figure 11: Creating a new dataset for donut plot
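The count-to-angle conversion can be sketched as follows. The sample rows, the column names ("category", "count", "angle", "color"), and the colors are assumptions for illustration; only the name df_mur comes from the text above.

```python
from math import pi

import pandas as pd

# Illustrative rows only; the real column has one entry per game.
games = pd.DataFrame(
    {"Murdered": ["Murdered", "Not Murdered", "Missing", "Murdered"]}
)

# Count each category and turn the result into a two-column DataFrame.
df_mur = (
    games["Murdered"].value_counts()
    .rename_axis("category")
    .reset_index(name="count")
)

# Each slice gets a share of the full circle (2 * pi radians).
df_mur["angle"] = df_mur["count"] / df_mur["count"].sum() * 2 * pi

# Hypothetical colors, assigned by category name.
colors = {"Murdered": "#d62728", "Not Murdered": "#2ca02c", "Missing": "#7f7f7f"}
df_mur["color"] = df_mur["category"].map(colors)

print(df_mur)
```

Because the angles are proportions of 2 * pi, they always add up to a full circle regardless of how many categories there are.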
You can plot the donut chart by using the annular_wedge function.
Figure 12: Creating a donut plot
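A donut plot built with annular_wedge might look like the sketch below. The counts and colors are hypothetical, and the cumsum transform from bokeh.transform is used to turn the per-category angles into start and end angles.

```python
from math import pi

import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.transform import cumsum

# Hypothetical counts; the real values come from the Murdered column.
df_mur = pd.DataFrame({
    "category": ["Murdered", "Not Murdered", "Missing"],
    "count": [40, 35, 25],
    "color": ["#d62728", "#2ca02c", "#7f7f7f"],
})
df_mur["angle"] = df_mur["count"] / df_mur["count"].sum() * 2 * pi

p = figure(title="Murdered crewmates (sketch)", height=350,
           toolbar_location=None, tools="hover",
           tooltips="@category: @count", x_range=(-0.5, 1.0))

# inner_radius > 0 leaves the hole that turns a pie into a donut.
donut = p.annular_wedge(
    x=0, y=1, inner_radius=0.2, outer_radius=0.4,
    start_angle=cumsum("angle", include_zero=True),
    end_angle=cumsum("angle"),
    line_color="white", fill_color="color",
    legend_field="category", source=ColumnDataSource(df_mur),
)

# A circular chart has no meaningful axes or grid.
p.axis.visible = False
p.grid.grid_line_color = None

# To open the plot in a browser:
# from bokeh.plotting import show; show(p)
```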
The graph below shows the donut plot produced by the above code. The counts of Murdered and Not Murdered are close to each other, and a significant number of values are missing. Because of this, you cannot determine whether the majority of the people were murdered or not.
Figure 13: Donut plot
Histogram
Histograms are used to plot numerical data according to the range they fall into. The data is plotted in bins or rectangles. The y-axis corresponds to the amount of data present in a specific range or at a certain point.
You use the plot_bokeh function to plot a histogram of the minutes column, which holds the duration of each game. Here, you need to set the kind attribute to "hist".
Figure 14: Histogram
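The same kind of histogram can also be built with plain Bokeh instead of plot_bokeh, by binning the values with NumPy and drawing one rectangle per bin. This is a swapped-in technique, not the tutorial's code, and the durations below are made up for illustration.

```python
import numpy as np
from bokeh.plotting import figure

# Hypothetical game durations in minutes.
minutes = np.array([5, 7, 8, 8, 9, 10, 10, 11, 12, 12, 13, 14, 18, 22])

# counts[i] is the number of games between edges[i] and edges[i + 1].
counts, edges = np.histogram(minutes, bins=6)

p = figure(title="Game length (sketch)", x_axis_label="Minutes",
           y_axis_label="Number of games", width=600, height=350)

# quad() draws one rectangle per bin, from the baseline up to the count.
bars = p.quad(top=counts, bottom=0, left=edges[:-1], right=edges[1:],
              fill_color="steelblue", line_color="white")

# To open the plot in a browser:
# from bokeh.plotting import show; show(p)
```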
From the above graph, you can say that most games last for 6 - 14 minutes.
To compare categories, you can assign a different color to each one. The histogram below shows the number of imposters and crewmates left at the end of each game, plotting the game length against the team column.
Figure 15: Stacked Histogram
The above graph tells you that the longer a game goes on, the higher the chances of the imposters getting caught. Games that go on beyond 4.5 minutes barely have any imposters left.
Bar Plot
While histograms show the distribution of numerical data, bar plots show the distribution of categorical data. They use bars, one per category, whose height represents the amount of data. When the bars are stacked on top of each other, the result is called a stacked bar chart.
To plot a bar graph, you set the kind attribute of plot_bokeh to "bar". The bar graph below shows the number of people who have completed the tasks given to them.
Figure 16: Bar graph
Now, go ahead and plot two bar graphs on top of each other. This type of bar chart is called a stacked bar graph. Here, you plot the teams and outcomes against each other. A stacked bar graph can be plotted by setting the stacked attribute to True.
Figure 17: Stacked Bar graph
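For comparison, a stacked bar graph can also be drawn with plain Bokeh's vbar_stack rather than plot_bokeh. The sketch below is a swapped-in technique with hypothetical win and loss counts per team.

```python
from bokeh.plotting import figure

teams = ["Crewmate", "Imposter"]
outcomes = ["Win", "Loss"]

# Hypothetical counts of wins and losses for each team.
data = {"team": teams, "Win": [60, 15], "Loss": [40, 10]}

# Passing a list of strings as x_range makes the x-axis categorical.
p = figure(x_range=teams, height=350, title="Outcome by team (sketch)")

# One bar segment is drawn per name in `outcomes`, stacked in that order.
stack = p.vbar_stack(outcomes, x="team", width=0.6,
                     color=["#2ca02c", "#d62728"],
                     legend_label=outcomes, source=data)

# To open the plot in a browser:
# from bokeh.plotting import show; show(p)
```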
You can also plot bars horizontally using the "barh" method of plot_bokeh. Here, you plot the outcomes and tasks together.
Figure 18: Horizontally Stacked Bar graph
Another bar graph is the stacked vertical bar graph. Here, one half of the graph is negative, and the other half is positive. To plot this, multiply the loss column by -1.
Figure 19: Creating negative columns
The bar graph below shows the wins and losses of each user, by user id.
Figure 20: Stacked vertical bar graph
Area Plot
An area chart combines the line chart and the bar chart: it is a line chart with the area below the line shaded. Using an area chart, you can see how the values of different groups change over a period of time. In a stacked area chart, each group is drawn on top of the previous one, so every group has its own baseline.
You can plot an area chart by using the varea_stack method. Here, you plot the sabotages fixed against the time at which they were fixed.
Figure 21: Area Plot
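A minimal varea_stack sketch is shown below. The two series and their names ("group_a", "group_b") are hypothetical stand-ins for however the sabotage counts are grouped in the real data.

```python
from bokeh.plotting import figure

# Hypothetical number of sabotages fixed in each minute, for two groups.
data = {
    "minute": [1, 2, 3, 4, 5, 6],
    "group_a": [9, 8, 6, 5, 3, 2],
    "group_b": [3, 3, 2, 2, 1, 1],
}

p = figure(title="Sabotages fixed over time (sketch)",
           x_axis_label="Minute", y_axis_label="Sabotages fixed",
           width=600, height=350)

# Each series is shaded on top of the previous one.
areas = p.varea_stack(["group_a", "group_b"], x="minute",
                      color=["#1f77b4", "#ff7f0e"],
                      legend_label=["Group A", "Group B"], source=data)

# To open the plot in a browser:
# from bokeh.plotting import show; show(p)
```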
From the above figure, you can see that fewer sabotages are fixed as time increases.
Layout Function
The layout functions in Python Bokeh are used to arrange your plots and widgets. This makes it possible to see multiple graphs at the same time. Used primarily for designing dashboards, they let you build grids of plots.
You can create a layout by using the grid function from bokeh.layouts. You start by creating multiple graphs. Here, you first plot a lollipop graph of the top 10 users with the most wins. You then make a donut graph of the ratio of murdered crewmates. You also make a donut plot of the number of crewmates and imposters.
Figure 22: Creating multiple plots
After plotting the graphs, use the grid function to arrange them in a layout. Charts that are to appear in the same row are placed in the same list. The lists are comma-separated, and their order determines which graphs appear at the top and which at the bottom.
Figure 23: Bokeh Layout
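The row-per-list arrangement can be sketched with three placeholder plots. The small_plot helper and its data are hypothetical and stand in for the lollipop and donut graphs described above.

```python
from bokeh.layouts import grid
from bokeh.plotting import figure


def small_plot(title):
    """Hypothetical helper: a tiny placeholder line plot with a title."""
    p = figure(title=title, width=300, height=250)
    p.line([1, 2, 3], [1, 3, 2], line_width=2)
    return p


wins = small_plot("Top users by wins")
murdered = small_plot("Murdered ratio")
teams = small_plot("Crewmates vs imposters")

# Each inner list is one row: two plots on top, one plot below.
dashboard = grid([[wins, murdered], [teams]])

# To open the layout in a browser:
# from bokeh.plotting import show; show(dashboard)
```

Moving a plot from one inner list to another is all it takes to change which row it appears in.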
Interactivity With Bokeh
Now, use Python Bokeh to design a dashboard that represents the horsepower of cars. You will also make your graph interactive and see how a single graph can convey a lot of information.
Start by importing all the necessary modules into your program.
Figure 24: Importing necessary modules
Then, read your data into a DataFrame. Use ColumnDataSource to convert the data into a format accepted by Python Bokeh; it is the object that provides data to the glyphs in a plot.
Figure 25: Reading in data
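Wrapping a DataFrame in a ColumnDataSource can be sketched as below. The rows, the column names, and the example.com image links are placeholders, not the tutorial's actual cars data.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# Placeholder rows; the real data would come from a file, for example
# with pd.read_csv.
cars = pd.DataFrame({
    "Car": ["Car A", "Car B", "Car C"],
    "Horsepower": [450, 300, 600],
    "Price": [80000, 55000, 120000],
    "Image": [
        "https://example.com/car-a.png",
        "https://example.com/car-b.png",
        "https://example.com/car-c.png",
    ],
})

# Sort so that bars drawn from this data appear in ascending order.
cars = cars.sort_values("Horsepower").reset_index(drop=True)

# Each DataFrame column becomes a named column that glyphs can refer to.
source = ColumnDataSource(cars)

print(source.column_names)
```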
The figure below shows the cars DataFrame. The data consists of the car name, horsepower, and price of each vehicle. There is also a column that holds the link to an image of the car.
Figure 26: Cars Dataset
Now, make a horizontal bar graph of the above data. You start by creating a plot of width 800 and height 600. The title of the plot is "Cars with Top Horsepower", and the x-axis has the label "Horsepower". You also need to specify the tools that will be available on the plot.
Then add a horizontal bar glyph to the plot and apply a color palette.
Figure 27: Creating a horizontal bar graph
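A figure with those dimensions and labels might be set up as in the sketch below. The car names, horsepower values, tool list, and choice of the Blues palette are assumptions; only the size, title, and axis label come from the text above.

```python
from bokeh.models import ColumnDataSource
from bokeh.palettes import Blues
from bokeh.plotting import figure
from bokeh.transform import factor_cmap

# Placeholder data, already sorted by ascending horsepower.
cars = ["Car A", "Car B", "Car C"]
source = ColumnDataSource({"car": cars, "hp": [300, 450, 600]})

p = figure(y_range=cars, width=800, height=600,
           title="Cars with Top Horsepower", x_axis_label="Horsepower",
           tools="pan,box_select,wheel_zoom,reset,save")

# Blues[3] runs from dark to light; reverse it so that more horsepower
# maps to a darker bar.
palette = Blues[3][::-1]

bars = p.hbar(y="car", right="hp", left=0, height=0.6, source=source,
              fill_color=factor_cmap("car", palette=palette, factors=cars),
              line_color="white", legend_field="car")

# To open the plot in a browser:
# from bokeh.plotting import show; show(p)
```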
Finally, add a hover tool, customize its tooltip with HTML so that it displays the car's price, horsepower, and the image at the linked address, and then display the graph.
Figure 28: Creating a hover tool
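An HTML tooltip can be sketched with HoverTool, whose tooltips argument accepts an HTML template in which @name refers to a column of the data source. The data, column names, and markup below are placeholders chosen for illustration.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Placeholder data; the image links are not real.
cars = ["Car A", "Car B"]
source = ColumnDataSource({
    "car": cars,
    "hp": [300, 450],
    "price": [55000, 80000],
    "image": ["https://example.com/car-a.png",
              "https://example.com/car-b.png"],
})

p = figure(y_range=cars, width=800, height=600,
           title="Cars with Top Horsepower", x_axis_label="Horsepower")
p.hbar(y="car", right="hp", left=0, height=0.6, source=source)

# @car, @hp, @price and @image are replaced with the values of the
# hovered bar.
hover = HoverTool(tooltips="""
<div>
  <img src="@image" alt="@car" height="80">
  <div><strong>@car</strong></div>
  <div>Horsepower: @hp</div>
  <div>Price: @price</div>
</div>
""")
p.add_tools(hover)

# To open the plot in a browser:
# from bokeh.plotting import show; show(p)
```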
The result is the graph shown below. The cars are arranged in ascending order of their horsepower. The more horsepower a car has, the darker its bar. The legend for the graph is at the top right and tells you which car each color is associated with.
Figure 29: Cars Graph
Conclusion
We have explored the powerful capabilities of Bokeh, a versatile visualization library in Python. We've covered what Bokeh is, the various types of graphs you can create, and how to arrange these graphs into sophisticated layouts. Bokeh's interactive features and beautiful visualizations make it an excellent choice for data scientists and developers who want to bring their data stories to life.
FAQs
1. What Are the Main Components of Bokeh?
- Bokeh Server
- Bokeh Models
- Bokeh Documents
- Bokeh Protocol
- Bokeh Widgets
- Bokeh Applications
- Bokeh Layouts
- Bokeh Tools
- Bokeh Plots
- Bokeh Data Sources
2. How Do I Customize the Appearance of My Bokeh Plot?
To customize the appearance of your Bokeh plot, you can use various styling options provided by Bokeh. You can modify the properties of plot elements such as axes, grids, legends, and the plot area itself. For example, you can set the "title", "x_axis_label", and "y_axis_label" properties to add descriptive titles and axis labels. You can also customize the appearance of glyphs (the visual shapes that represent your data points) by setting their attributes, such as "color", "size", "alpha" (transparency), "line_color", "line_width", "fill_color", and more. Additionally, you can use the "theme" feature to apply a consistent style across your plots. The layout and position of plot components like legends and toolbars can also be adjusted to suit your preferences. Combining these customization options allows you to create visually appealing and informative plots tailored to your specific needs.
3. Is Bokeh Better Than Matplotlib?
Whether Bokeh is better than Matplotlib depends on your needs and use cases. Bokeh and Matplotlib are powerful visualization libraries that serve different purposes. Matplotlib is well-established and excels in creating static, publication-quality plots with fine-grained control over all plot aspects. It's ideal for creating detailed and complex visualizations for research and analysis.
Bokeh, on the other hand, is designed to create interactive and web-friendly visualizations. It allows users to create plots that are easily embedded in web applications and provides dynamic interactions like zooming, panning, and tooltips. If your primary goal is to create interactive, web-based visualizations, Bokeh is likely the better choice.
However, if you need high-quality static plots for detailed analysis and publication, Matplotlib might be more suitable. Ultimately, the choice between Bokeh and Matplotlib depends on the specific requirements of your visualization project.
4. Is Bokeh Better Than Plotly?
Whether Bokeh or Plotly is the better choice depends on your requirements. Bokeh suits intricate, interactive plots because of its adaptability and its integration with web applications. Plotly, in contrast, is easy to use, provides many pre-built chart styles, and lets you produce polished interactive visualizations quickly. Choose Plotly for user-friendliness and quick creation of interactive charts, and Bokeh for extensive customization and web app integration.