Plotly library for interactive data visualisation


Introduction

Static plots work in most cases. However, there are cases where we want to interact with our plots, and cases where we want the end user to be able to interact with them. So, today's topic is interactive data visualisation. There are many good libraries available in Python for interactive data visualisation, and Plotly is one of them. With Plotly, you can also make interactive geographic maps.

We will also use Cufflinks. It works as a connector between the Pandas library and Plotly and helps us plot interactive graphs directly from a Pandas dataframe. So, if you are familiar with Pandas dataframes, you have already learned about 50% of Plotly. This is the benefit of using Cufflinks; Plotly's native code is otherwise a bit complicated.

Installation

Installing Plotly is easy. First install Plotly, then install Cufflinks. Don't install Plotly again after installing Cufflinks, because Cufflinks support for the newest Plotly version might not have been released yet. The install commands are given below.

$ pip install plotly
$ pip install cufflinks
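
Because Cufflinks can lag behind Plotly releases, it helps to check which versions were actually installed. The following is a minimal sketch using only the Python standard library; the helper name installed_version is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Print the versions of the two libraries used in this tutorial
for name in ("plotly", "cufflinks"):
    print(name, installed_version(name))
```

If one of the lines prints None, that package is not installed in the current environment.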

Importing libraries

import seaborn as sns
import pandas as pd
import matplotlib.image as mpimg
import os
import plotly.graph_objs as go
import numpy as np
import cufflinks as cf

from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)
cf.go_offline()

We import seaborn for the dataset, pandas and numpy for data handling, matplotlib for image handling and, lastly, plotly and cufflinks. We will use Plotly in offline mode.

Plotly is actually an online library that hosts your data visualisations. However, it also provides an offline mode that can be used to draw interactive plots offline. For the offline interactive mode to work properly inside a Jupyter notebook, we need to import download_plotlyjs, init_notebook_mode, plot and iplot from plotly.offline.

The main dataset for plotly library

dataset = sns.load_dataset('mpg')
dataset.head(5)

Most of the interactive plots will be based on the dataset named ‘mpg’ from the seaborn library. We will work with other datasets later along the way. The first 5 rows of the dataset are shown below.

Dataset for plotly library

Static plot with pandas

Before the interactive plots, we will plot our data with a static Pandas graph. Let’s call the plot() method on our dataframe to observe the data and to remind ourselves of the simple Pandas plot method, because with the Cufflinks bindings the interactive commands work the same way on a Pandas dataframe. We will plot the values of the 'mpg', 'cylinders' and 'acceleration' columns. The code is given below.

dataset2=dataset[['mpg','cylinders','acceleration']]
dataset2.plot(figsize=(10,5));
Dataset static figure with pandas

Interactive plots (line) with plotly library

Now we will make interactive plots using Plotly. To draw an interactive plot from a Pandas dataframe, we simply call the iplot() method instead of the plot() method. Take a look at the following code.

dataset2.iplot()

After running the above code, you will see an interactive line plot for the 'mpg', 'cylinders' and 'acceleration' columns, as in the figure below.

Output:

Now, if you hover your mouse cursor over the plot, you should see the values. In addition, you can zoom in and out, add and remove columns from the plot, and save the graph as a static image.

In today’s discussion, we will draw some of the most commonly used interactive plots using Plotly.

Bar Chart in plotly library

Regular Bar Chart

Interactive plotting can be done with the iplot() function. For a bar plot, pass “bar” as the value of the kind parameter of iplot(). In addition, pass the list of columns you want on the x-axis to the x attribute and the column for the y-axis to the y attribute. The following script plots a bar plot with the origin and cylinders columns on the x-axis and mpg on the y-axis.

dataset.iplot(kind='bar', x=['origin','cylinders'], y='mpg')

Output:

You can see from the output that bars have been plotted for mpg. The bars show the sum values for the possible combinations of the origin and cylinders columns.
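
If you want to see the numbers behind such bars, you can compute the sums per combination yourself with pandas. The sketch below uses a made-up dataframe with the same column names as the mpg dataset. It only illustrates the aggregation and does not claim to reproduce exactly what the chart draws.

```python
import pandas as pd

# Made-up rows with the same column names as the mpg dataset
demo = pd.DataFrame({'origin': ['usa', 'usa', 'japan', 'japan', 'usa'],
                     'cylinders': [8, 8, 4, 4, 4],
                     'mpg': [18.0, 15.0, 31.0, 33.0, 26.0]})

# Sum of mpg for every (origin, cylinders) combination present in the data
totals = demo.groupby(['origin', 'cylinders'])['mpg'].sum()
print(totals)
```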

Bar Chart of Mean

You can also call an aggregate function (e.g. mean) on the Pandas dataframe and then call the iplot() function with “bar” as the value of the kind attribute. In our case, if you want a bar plot containing the average values of the mpg, cylinders and acceleration columns, you can use the following script:

dataset2.mean().iplot(kind='bar')

Output:

In the output, if you hover your mouse over the bars, the mean values for mpg, cylinders and acceleration will appear.
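
It can help to look at what mean() returns before it is plotted. The sketch below uses made-up values in place of dataset2. The result is a Series with one entry per column, and each entry corresponds to one bar.

```python
import pandas as pd

# Made-up values standing in for dataset2
demo = pd.DataFrame({'mpg': [10.0, 20.0, 30.0],
                     'cylinders': [4, 6, 8],
                     'acceleration': [12.0, 15.0, 18.0]})

# mean() collapses each column to one number, giving a Series indexed by column name
means = demo.mean()
print(means)
```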

Horizontal bar chart

In addition to the vertical bar chart, you can also draw a horizontal bar chart. You just have to pass “barh” instead of “bar” as the value of the kind attribute. The following code draws the horizontal bar plot.

dataset2.mean().iplot(kind='barh')

Output:

Scatter plot by plotly library

Plotly also offers the common interactive scatter plot. You just need to pass “scatter” as the value of the kind parameter of the iplot() function, together with the column names for the x- and y-axis. The script below draws a scatter plot with the displacement column on the x-axis and the horsepower column on the y-axis.

dataset.iplot(kind='scatter', x='displacement', y='horsepower', 
              mode='markers', xTitle='Displacement', yTitle='Horsepower')

Output:

Now, if you hover your mouse over the plot, the x and y values will appear on the screen.

Box plot by plotly library

The Plotly library also has an interactive box plot option. You simply pass “box” as the value of the kind parameter of the iplot() function.

dataset2.iplot(kind='box')

Output:

If you hover your mouse over a box plot, you will see the max, min, median, q1 and q3 values.
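
The five numbers shown on hover can also be computed with pandas. This is a sketch on made-up values. Plotly may use a slightly different quartile convention than pandas, so q1 and q3 can differ a little on real data.

```python
import pandas as pd

# Made-up values, chosen so the quartiles are easy to check by hand
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9])

# The five-number summary that a box plot visualises
summary = {
    'min': values.min(),
    'q1': values.quantile(0.25),
    'median': values.median(),
    'q3': values.quantile(0.75),
    'max': values.max(),
}
print(summary)
```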

Histogram plot

You can also use the Plotly library for interactive histograms. You just need to pass “hist” as the value of the kind attribute of the iplot() method. To plot the histogram of the “mpg” column of the data, use the next script.

dataset['mpg'].iplot(kind='hist',bins=25)

Output:
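
To see what binning into 25 bins means, here is a small sketch with numpy on made-up values. Plotly may choose its own bin edges, so the exact bars can differ from these counts.

```python
import numpy as np

# Made-up mpg-like values; the real column comes from the seaborn dataset
values = np.array([9.0, 13.0, 14.5, 18.0, 18.0, 22.0, 26.5, 31.0, 36.0, 46.6])

# Split the value range into 25 equal-width bins and count the values in each bin
counts, edges = np.histogram(values, bins=25)
print(counts)
print(edges[0], edges[-1])
```

There is always one more edge than there are bins, and every value falls into exactly one bin.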

Scatter matrix plot

The scatter matrix plot is a set of interactive plots, one for every pair of columns, drawn together as a matrix. The code below produces the interactive scatter matrix plot.

dataset2.scatter_matrix()

Output:

Spread plot

The Plotly library can also draw an interactive spread plot. The spread plot shows the spread between two or more numerical columns at any particular point. For example, to see the spread between mpg and horsepower, you can use the spread kind as below:

dataset[['mpg','horsepower']].iplot(kind='spread')

Output:
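
To make the idea of a spread concrete, the sketch below computes it by hand on made-up values. It assumes the spread is simply the first column minus the second. Cufflinks may compute or display it differently.

```python
import pandas as pd

# Made-up values with the same column names as in the plot above
demo = pd.DataFrame({'mpg': [18.0, 15.0, 36.0],
                     'horsepower': [130.0, 165.0, 58.0]})

# Assumed definition of the spread: row-wise difference between the two columns
spread = demo['mpg'] - demo['horsepower']
print(spread)
```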

3D plot in plotly library

The Plotly library can also create interactive 3D plots. For example, the following code draws a 3D surface plot for the mpg, cylinders and acceleration columns.

dataset2.iplot(kind='surface',colorscale='rdylbu')

Output:

Heatmap plot

There is also an interesting interactive heatmap option in the Plotly library. We just need to pass “heatmap” as the value of the kind attribute of the iplot() method. The code for the heatmap plot is given below.

df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/volcano.csv')
df.iplot(kind='heatmap',colorscale='spectral')

Output:

Image plot with heatmap

Plotly’s latest release can plot interactive images. The problem is that the latest Cufflinks release still doesn’t support the latest version of Plotly, and older versions of Plotly cannot plot images. But there is a workaround: we can plot a single-band image using Plotly’s heatmap option. The result may not look exactly like the actual image, as we are not using all 3 RGB bands. The script for plotting an image with the heatmap option is given below. The GitHub link to all project files and scripts is given at the end of the tutorial.

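import io
import urllib.request

url = 'https://github.com/hasanbdimran/Plotly-libray-tutorial-with-cufflinks/blob/master/myAvatar_rz.png?raw=true'
# Recent matplotlib releases reject URLs in imread(), so download the bytes first
with urllib.request.urlopen(url) as response:
    image = mpimg.imread(io.BytesIO(response.read()), format='png')
test_image = image.copy()[:, :, 0].T  # keep only the first band and transpose it
test_image = test_image[:, ::-1]      # reverse the second axis
df_im = pd.DataFrame(test_image)
df_im.iplot(kind='heatmap', colorscale='-ylgnbu')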

Output:
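
The transpose and the reversed slice in the script above are easier to follow on a tiny array. The sketch below applies the same two numpy operations to a made-up 2x3 "image".

```python
import numpy as np

# A tiny 2x3 "image" with made-up pixel values (rows x columns)
band = np.array([[1, 2, 3],
                 [4, 5, 6]])

transposed = band.T            # shape (3, 2): rows and columns are swapped
flipped = transposed[:, ::-1]  # the order along the second axis is reversed
print(flipped)
```

Here flipped is [[4, 1], [5, 2], [6, 3]], so each original column has become a row and the original row order is reversed.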

Geographic Map with plotly library

Interactive geographic maps are a very important feature of the Plotly library. But geographic maps cannot be drawn through Cufflinks, so we have to use Plotly's native syntax. For this we import the Plotly library in offline mode without Cufflinks. The code is given below.

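import pandas as pd
import plotly.graph_objs as go
# plotly.plotly (the online module) moved to the separate chart-studio package in Plotly 4
# and is not needed for offline plotting, so it is not imported here
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot

init_notebook_mode(connected=True)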

1. Map data dictionary

The first step in making a geographic map is to create a map data dictionary. This dictionary holds the map type, locations, location mode, colorscale, text, z values and colorbar. An example of a map data dictionary is given below.

map_data = dict(type='choropleth',
            locations=['MI', 'CO', 'FL', 'IN'],
            locationmode='USA-states',
            colorscale='Portland',
            text=['Michigan', 'Colorado', 'Florida', 'Indiana'],
            z=[1.0,2.0,3.0,4.0],
            colorbar=dict(title="USA States")
           )

2. Map layout

The second step is to create a map layout dictionary. This dictionary holds the layout information for the whole map. For our example, the layout dictionary is given below.

map_layout = dict(geo = {'scope':'usa'})

3. Graph object

If you look at the section where we imported the libraries, we imported the plotly.graph_objs module as go. The third step is to create a Figure object from it. The object takes two parameters: data and layout. We pass our map data dictionary to the first parameter and the map layout dictionary to the second. The code is given below.

map_actual = go.Figure(data=[map_data], layout=map_layout)

4. Calling iplot() method

The last step is to call the iplot() function and pass it the graph object that we have created in the third step. The code is given below.

iplot(map_actual)

Output:

In the map above, you will see the geographic map with four US states colored. The other states are empty, as no information has been provided for them. If you hover the mouse over the colored states, you will see the corresponding values of the text and z keys specified in the data dictionary.

Geographical choropleth US map

Now we have a basic idea of how to plot an interactive geographic map with the Plotly library. We will apply this knowledge to make a choropleth US map based on the total agricultural exports of each state. The data and the script can be found on Plotly’s documentation website.

dft5 = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2011_us_ag_exports.csv')
dft5.head()
State-wise agricultural export of US

The code below produces the interactive choropleth US map based on agricultural exports. In the map data, the map type is ‘choropleth’. locations is taken from the ‘code’ column of the dataframe, which holds the state codes. locationmode is ‘USA-states’. The text parameter receives the state names from the ‘state’ column. The main information, z, is taken from the ‘total exports’ column of the dataframe.

map_data = dict(type='choropleth',
            locations=dft5['code'],
            locationmode='USA-states',
            colorscale='YlGnBu',
            text=dft5['state'],
            marker=dict(line=dict(color='rgb(255,0,0)', width=2)),
            z=dft5['total exports'],
            colorbar=dict(title="Total exports <br> Millions USD")
           )
map_layout = dict(title='USA States Agricultural exports of 2011',
              geo=dict(scope='usa',
                         showlakes=True,
                         lakecolor='rgb(85,173,240)')
             )
map_actual = go.Figure(data=[map_data], layout=map_layout)
iplot(map_actual)

Output:

Geographical choropleth world map

Now we will plot a choropleth map of world GDP. The GDP data is taken from Plotly’s documentation website. The basic steps are the same as above. The GDP data is first loaded with pandas.

dfw = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2014_world_gdp_with_codes.csv')
GDP data

The code below draws the choropleth world map based on GDP. In the map_data dictionary, locations is loaded from the ‘CODE’ column of the dataframe. locationmode is ‘ISO-3’ for the world map. Three commonly used values of locationmode are ‘country names’, ‘ISO-3’ and ‘USA-states’. text is loaded from the ‘COUNTRY’ column. The main GDP data is passed to z from the ‘GDP (BILLIONS)’ column. In map_layout, the scope is set to ‘world’. The rest of the code is self-explanatory.

map_data = dict(type='choropleth',
            locations=dfw['CODE'],
            locationmode='ISO-3', # country names, ISO-3, USA-states
            colorscale='YlGnBu',
            text=dfw['COUNTRY'],
            marker=dict(line=dict(color='darkgray', width=0.5)),
            z=dfw['GDP (BILLIONS)'],
            colorbar=dict(title="GDP<br>Billions US$")
           )
map_layout = dict(title='2014 Global GDP',
              geo=dict(scope='world',
                         showlakes=True)
             )
map_actual = go.Figure(data=[map_data], layout=map_layout)
iplot(map_actual, show_link=True)

Output:

Concluding remarks

The interactive plotting library Plotly is a very powerful library. In addition, Cufflinks support for pandas makes it very efficient to use. Most of the plots here are based on Cufflinks code, except for the geographic maps. There are also very interesting tutorials available on the internet. For example, I learned Plotly from the Plotly documentation and a blog post. All the links are given below.

Plotly open source library

https://plot.ly/python/

Another interesting blogpost on plotly

https://stackabuse.com/using-plotly-library-for-interactive-data-visualization-in-python/

Github repository link

https://github.com/hasanbdimran/Plotly-libray-tutorial-with-cufflinks
