How to Build Interactive Data Visualizations for Python with Bokeh
- Bokeh is a powerful tool for exploring and understanding your data or creating beautiful custom charts for a project or report.
- It allows the use of standard Pandas and NumPy objects for plotting, including NumPy arrays, plain lists and Pandas series.
- In the Python visualization space, Bokeh is a strong candidate for building interactive and dynamic visualizations across different media.
This article will take you through:
- Using Bokeh to transform your data into visualizations
- Customizing your visualizations using Bokeh
- Adding interactivity to your visualizations
To install Bokeh in your Python environment, run either of the following commands:
conda install bokeh
pip install bokeh
The bokeh.sampledata module provides prepared .csv and .db files with widely used datasets, for instance Apple NASDAQ stock data and airline on-time flight data.
In a nutshell, we will go through the process of creating a Bokeh application, which is a recipe for generating Bokeh documents. Typically, this is Python code run by a Bokeh server when new sessions are created.
What are the steps involved in building a visualization using Bokeh?
Preparing the data
How do you prepare data using libraries such as NumPy and Pandas to transform it into a form that is best suited for your intended visualization?
Bokeh allows the use of standard Pandas and NumPy objects for plotting. There are several Python data structures that could be used for further Bokeh visualization:
- NumPy arrays
- plain lists
- Pandas series
Let us consider Bitcoin historical data as an example of preparing time series data for visualization (Fig. 1). This dataset contains CSV files for select bitcoin exchanges for the period from January 2012 to December 2020, with minute-by-minute updates of OHLC (Open, High, Low, Close) data.
Fig. 1. Bitcoin history DataFrame
NumPy arrays are used to store the data (Fig. 2).
Fig. 2. Bitcoin history to NumPy array
The resulting Bokeh plot is shown in Fig. 3.
Fig. 3. Interactive Bokeh plot
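Since the figures carry the original code, here is a minimal sketch of the same idea: a DataFrame column is converted to NumPy arrays and passed straight to a glyph method. The data is synthetic, and the column names "Timestamp" and "Close" are assumptions for illustration rather than the exact layout of the Bitcoin dataset.

```python
import numpy as np
import pandas as pd
from bokeh.plotting import figure

# Synthetic stand-in for the Bitcoin history DataFrame.
# Values are random; column names are assumptions for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Timestamp": pd.date_range("2020-01-01", periods=200, freq="min"),
    "Close": 7000 + rng.normal(0, 5, 200).cumsum(),
})

# Convert the columns of interest to plain NumPy arrays.
x = df["Timestamp"].to_numpy()
y = df["Close"].to_numpy()

# Arrays can be passed directly to a glyph method.
p = figure(x_axis_type="datetime", title="Close price (synthetic data)")
p.line(x, y, line_width=2)
```

Calling show(p) from bokeh.plotting would then open the plot in a browser or render it in a notebook, depending on the configured output.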
When data is passed like this, Bokeh works behind the scenes to make a ColumnDataSource for further plotting.
At the most basic level, a ColumnDataSource is simply a mapping between column names and lists of data. The ColumnDataSource takes a data parameter, which can be a dict or a Pandas DataFrame. If one positional argument is passed to the ColumnDataSource initializer, it is taken as data. Once the ColumnDataSource has been created, it can be passed to the source parameter of plotting methods, which lets you pass a column’s name as a stand-in for the data values (Fig. 4).
Fig. 4. Using the ColumnDataSource
The data preparation stage is described in detail in the official documentation (Providing Data — Bokeh 2.2.3 Documentation).
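As a rough illustration of the paragraph above, the sketch below builds a ColumnDataSource from a small DataFrame and refers to its columns by name. The column names "day" and "close" and all values are made up for the example.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Made-up values; the column names are hypothetical.
df = pd.DataFrame({
    "day": [1, 2, 3, 4, 5],
    "close": [7200.0, 7345.5, 6985.0, 7410.2, 7350.9],
})

# A single positional argument is taken as the data.
source = ColumnDataSource(df)

# Column names stand in for the data values.
p = figure(title="Close price via ColumnDataSource")
p.line(x="day", y="close", source=source)
```

Sharing one source between several glyphs is what makes linked selections and server-side updates possible later on.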
Determining where the visualization will be rendered
At this step, you’ll determine how you want to generate and ultimately view your visualization. The plot is the key concept in the Bokeh library. Plots are containers for glyphs, guides, annotations, and other tools. There are two approaches to generating and saving plots: standalone .html files, or a local or remote server application.
An application server is the most versatile and convenient way to distribute an application. In this case, various widgets can be used to change input values. Callback methods update the plot's data on the server; these changes are automatically synced back to the browser, and the plot updates. An interactive application lets users manipulate data and obtain up-to-date plots (Fig. 4). One example is the Bokeh Crossfilter example application, which explores the autompg dataset.
Fig. 4. Bokeh server application
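The widget-and-callback flow described above might look like the following minimal sketch of a server application. The slider, the toy data, and the update function are all hypothetical and are not taken from the Crossfilter example.

```python
from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, Slider
from bokeh.plotting import figure

xs = [0, 1, 2, 3, 4, 5]
source = ColumnDataSource(data={"x": xs, "y": [v ** 1 for v in xs]})

p = figure(title="y = x ** power")
p.line("x", "y", source=source)

# Widget used to change the input value.
slider = Slider(start=1, end=4, value=1, step=1, title="power")

def update(attr, old, new):
    # Runs on the server; replacing source.data is synced to the browser.
    source.data = {"x": xs, "y": [v ** new for v in xs]}

slider.on_change("value", update)

# Register the layout as a root of the current document.
curdoc().add_root(column(slider, p))
```

Saved as a script, a file like this is meant to be launched with the bokeh serve command rather than run with the plain Python interpreter.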
A Jupyter notebook is one way to create visualizations during exploratory data analysis. An alternative approach is to develop a small app that can be run locally or sent to colleagues to run locally. The Bokeh server is very useful and easy to use in this scenario.
The command bokeh serve --show myapp.py opens a new browser tab automatically at the address of the running application. More details on server creation can be found in the official documentation (Running a Bokeh Server — Bokeh 2.2.3 Documentation).
Setting up the figure(s)
At this step, you’ll specify data visualization filters and plot tools: pan/drag, click/tap, scroll/pinch.
There are various interactive tools for changing plot parameters such as zoom level, range extents etc. These tools could be grouped into four categories:
- Gestures (Pan/Drag Tools, Click/Tap Tools, Scroll/Pinch Tools)
- Actions (Reset Tool)
- Inspectors (HoverTool, CrosshairTool)
- Edit Tools
All these tools are combined in the toolbar, whose placement can be set with parameters such as toolbar_location in the figure() function.
The code that uses the Tap Tool is shown in Fig. 5. The callback reports the coordinates of the point that was tapped.
Fig. 5. Code using the Tap Tool
The results of using the Tap Tool are shown in Fig. 6. The coordinates are displayed in the browser console, which can be opened with the F12 key.
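A minimal sketch of configuring the toolbar could look like this. The particular selection of tools and the sample points are arbitrary choices for illustration.

```python
from bokeh.models import HoverTool
from bokeh.plotting import figure

# Tools are chosen by name; toolbar_location places the toolbar.
p = figure(
    tools="pan,wheel_zoom,box_zoom,tap,crosshair,reset",
    toolbar_location="above",
    title="Configured toolbar",
)

# Inspectors such as HoverTool can also be added as objects.
# "$x" and "$y" show the cursor position in data coordinates.
p.add_tools(HoverTool(tooltips=[("x", "$x"), ("y", "$y")]))

p.scatter([1, 2, 3, 4], [4, 7, 2, 5], size=10)
```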
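Because the figure with the original code is not reproduced here, the following is only a sketch of one way to log tap coordinates to the browser console, using a JavaScript callback attached to the tap event. The original code may have been written differently.

```python
from bokeh.events import Tap
from bokeh.models import CustomJS
from bokeh.plotting import figure

p = figure(tools="tap,pan,reset", title="Tap anywhere on the plot")
p.scatter([1, 2, 3], [3, 1, 2], size=12)

# cb_obj is the tap event; x and y are its data-space coordinates.
log_tap = CustomJS(code="console.log('tapped at', cb_obj.x, cb_obj.y)")

# Run the JavaScript snippet in the browser on every tap.
p.js_on_event(Tap, log_tap)
```

Since the callback is JavaScript, it also works in a standalone .html file without a running Bokeh server.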
Fig. 6. Tap Tools usage
More details on plot tools can be found in the official documentation Configuring Plot Tools — Bokeh 2.2.3 Documentation
Connecting to and drawing your data
At this step, you’ll use Bokeh’s multitude of renderers to give shape to your data. We shall explore the visual properties of lines, fills, text, and glyphs. Bokeh provides a wide range of renderers such as circle(), square(), triangle(), asterisk(), line(), vbar() etc. as basic visual building blocks, or glyphs. An example of using these graphic primitives is shown in Fig. 7.
An example of a scatter plot with circles for the Iris dataset, with the related code snippet, is shown in the Bokeh gallery (iris.py — Bokeh 2.2.3 Documentation).
There are many visual styling attributes, such as line, fill, and text properties. The full list of properties is given in Styling Visual Attributes — Bokeh 2.2.3 Documentation.
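As a small sketch of glyphs and their visual properties, the example below draws a line, markers, and bars on one figure. It uses scatter() with a marker argument in place of the individual marker methods, on the assumption that this form runs on a wider range of Bokeh versions. All values are made up.

```python
from bokeh.plotting import figure

x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

p = figure(title="Basic glyphs")

# Line properties: color, width, dash pattern.
line = p.line(x, y, line_color="navy", line_width=2, line_dash="dashed")

# Fill and line properties on markers.
markers = p.scatter(x, y, marker="circle", size=12,
                    fill_color="orange", fill_alpha=0.6, line_color="black")

# Vertical bars with a fill and no outline.
bars = p.vbar(x=x, top=[1, 2, 1, 3, 2], width=0.4,
              fill_color="lightgray", line_color=None)

# Text properties, here on the plot title.
p.title.text_font_size = "14pt"
```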
Organizing the layout
At this step, you’ll organize your visualizations into a layout in just a few lines of code. There are various layout options for organizing plots and widgets. Layouts allow you to manage multiple components to create interactive dashboards or data applications.
A grid of plots and widgets can be built with layout functions, for instance column(), row(), gridplot() etc. There are different sizing modes, e.g. ‘stretch_width’, ‘stretch_height’, ‘stretch_both’, ‘scale_width’ etc. These modes allow plots and widgets to resize based on the browser window.
For instance, code for different layouts of plots for the Bitcoin history dataset is shown in Fig. 7.
Fig. 7. Code for different layout arrangements
The results for grid plotting are shown in Fig. 8.
Fig. 8. Grid layout of the plots
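The layout functions mentioned in this section could be combined roughly as follows. The make_plot helper and the numbers are hypothetical; only column(), row(), gridplot() and the sizing_mode option come from the text.

```python
from bokeh.layouts import column, gridplot, row
from bokeh.plotting import figure

def make_plot(title, values):
    """Hypothetical helper: one small line plot per series."""
    p = figure(title=title)
    p.line(list(range(len(values))), values)
    return p

# Made-up OHLC-like series for illustration.
series = {
    "Open": [7.0, 7.2, 7.1, 7.4],
    "High": [7.3, 7.5, 7.4, 7.6],
    "Low": [6.8, 7.0, 6.9, 7.2],
    "Close": [7.2, 7.1, 7.4, 7.5],
}

# 2 x 2 grid of plots.
plots = [make_plot(name, values) for name, values in series.items()]
grid = gridplot([plots[:2], plots[2:]])

# Nested layout: a row of two plots stacked on top of a third plot,
# stretched to the width of the browser window.
a, b, c = (make_plot(t, series["Close"]) for t in ("A", "B", "C"))
stacked = column(row(a, b), c, sizing_mode="stretch_width")
```

Each plot is used in only one layout here, which keeps the two layouts independent if both are shown.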
Previewing and saving your beautiful data creation
Finally, explore your visualization, examine your customizations, and play with any interactions that you added.
There are two methods for generating output:
- output_file() for saving plots to an external .html file
- output_notebook() for rendering directly in Jupyter notebooks
In addition, there are methods for exporting images, e.g. export_svgs() and export_png(). Examples of Jupyter notebook and .html usage are given in the previous sections. So the library provides tools both for saving graphs to an external file and for embedding them in interactive notebooks.
In this article, we have examined the data preparation stage. The Bokeh library can work with standard Python objects such as flat lists and dictionaries, as well as NumPy arrays and Pandas DataFrames and Series. This makes it very easy to prepare data for visualization.
We reviewed the basic Bokeh visualization methods and gave example code for the Bitcoin history dataset. An interactive visualization tool was built as a result. It can be used in a Jupyter notebook for rapid prototyping. Saving the results to an external .html file allows you to embed them in web applications. The Bokeh server and client applications set the library apart from standard Python rendering tools such as Matplotlib or Seaborn.
Hence, Bokeh is a good tool for interactive data visualization. It offers more tools than many other plotting libraries, yet it is simpler than frameworks such as Dash.
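A minimal sketch of saving a plot to an external .html file is shown below. The file name and the temporary directory are arbitrary choices for the example.

```python
import os
import tempfile

from bokeh.io import output_file, save
from bokeh.plotting import figure

p = figure(title="Saved plot")
p.line([1, 2, 3], [2, 5, 3])

# Hypothetical output location in the system temp directory.
out_path = os.path.join(tempfile.gettempdir(), "bokeh_example.html")

# Configure file output, then write the document without opening a browser.
output_file(out_path, title="Bokeh example")
saved_path = save(p)
```

In a notebook, output_notebook() followed by show(p) renders the plot inline instead. The image export functions are left out of the sketch because they rely on additional browser-automation dependencies.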