Interactive Visualizations with plotly
So far, we have learned about two visualization libraries - matplotlib and seaborn. A third option we will discuss is the plotting library plotly.
Plotly can be used to create interactive visualizations. It is a newer library than matplotlib, and it does not have quite as many options for statistical graphs as seaborn. However, its user base is growing, and it works well in Jupyter notebooks because its figures are interactive by default. Plotly also has a JavaScript interface and can be used with Dash (plotly-dash) to create web apps.
In this section, we will be making the same visualizations we made in the seaborn lesson. In that lesson, we loaded data we cleaned from the paper
Potts, R.O., Guy, R.H. A Predictive Algorithm for Skin Permeability: The Effects of Molecular Size and Hydrogen Bond Activity. Pharm Res 12, 1628–1633 (1995). https://doi.org/10.1023/A:1016236932339
To visualize with plotly, we will import plotly express.
import os
import pandas as pd
import plotly.express as px
file_path = os.path.join("data", "potts_table1_clean.csv")
df = pd.read_csv(file_path)
df.head()
| | Compound | log P | pi | Hd | Ha | MV | R_2 | log K_oct | log K_hex | log K_hep |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | water | -6.85 | 0.45 | 0.82 | 0.35 | 10.6 | 0.00 | -1.38 | NaN | NaN |
| 1 | methanol | -6.68 | 0.44 | 0.43 | 0.47 | 21.7 | 0.28 | -0.73 | -2.42 | -2.80 |
| 2 | methanoicacid | -7.08 | 0.60 | 0.75 | 0.38 | 22.3 | 0.30 | -0.54 | -3.93 | -3.63 |
| 3 | ethanol | -6.66 | 0.42 | 0.37 | 0.48 | 31.9 | 0.25 | -0.32 | -2.24 | -2.10 |
| 4 | ethanoicacid | -7.01 | 0.65 | 0.61 | 0.45 | 33.4 | 0.27 | -0.31 | -3.28 | -2.90 |
Creating Scatter Plots#
First, we will create a scatter plot of log P vs pi. This is accomplished using the px.scatter command. Plotly express works with pandas dataframes. We must pass a pandas dataframe and indicate the column names for x and y.
When working with plotly, you capture the figure returned by this function in a variable. When you want to see the plot, call variable_name.show(). This does not have to be done in the same cell as figure creation.
fig = px.scatter(df, x='pi', y='log P')
fig.show()
You will notice in the figure above that you can hover your mouse over the data points and see information about the points. In the upper right corner of the figure you will find a set of buttons which will allow you to select different options for interacting with the graph.
Visualizing Linear Relationships#
The scatter function has a built-in option for adding a trendline. If you would like a linear fit between x and y, add the argument trendline='ols'. Under the hood, plotly will call statsmodels to perform an ordinary least squares fit and will add a line with this fit to the plot. If you want to perform the fit using scikit-learn, you will have to do the fit manually and add the line yourself. The trendline argument can also be set to 'lowess' for a 'locally weighted scatterplot smoothing' line.
fig = px.scatter(
    df, x='pi', y='log P', trendline='ols', trendline_color_override='darkblue'
)
fig.show()
To make a figure with subplots that shows all of the variables, the data has to be in long form, as it was when we created a plot with lmplot in seaborn. Then, we add another argument to px.scatter, facet_col, which makes a new plot for each unique value in the variable column.
# Get columns which are numbers - this is the same processing as seaborn.
df2 = df.select_dtypes(include="float")
df2_melt = df2.melt(id_vars="log P")
df2_melt.head()
| | log P | variable | value |
|---|---|---|---|
| 0 | -6.85 | pi | 0.45 |
| 1 | -6.68 | pi | 0.44 |
| 2 | -7.08 | pi | 0.60 |
| 3 | -6.66 | pi | 0.42 |
| 4 | -7.01 | pi | 0.65 |
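To make the wide-to-long reshaping concrete, here is a small sketch on a two-row, three-column frame built from values in the first table. The names wide and long_form are illustrative only.

```python
import pandas as pd

# Wide form: one column per descriptor.
wide = pd.DataFrame({
    "log P": [-6.85, -6.68],
    "pi": [0.45, 0.44],
    "Hd": [0.82, 0.43],
})

# Long form: "log P" is kept as an identifier column, and every other
# column is stacked into a (variable, value) pair.
long_form = wide.melt(id_vars="log P")
print(long_form)
```

Each non-identifier column contributes one block of rows, so two rows and two descriptor columns give four rows in long form.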
fig = px.scatter(df2_melt, x="value", y="log P", facet_col="variable",
                 trendline='ols',
                 trendline_color_override='darkblue')
fig.show()
By default, all of the subplots share the same x and y axis scales. In this particular case, we do not want the x-axes to share a scale, because the variables cover very different ranges. Call fig.update_xaxes(matches=None) after creating the figure to give each subplot its own x-axis scale.
The argument facet_col_wrap can be used to specify how many columns should be in the figure. The plots will be wrapped into rows using this number of columns. Finally, we add arguments for height and width to give the plot a better size.
fig = px.scatter(df2_melt, x="value", y="log P", facet_col="variable", facet_col_wrap=2,
                 trendline='ols', trendline_color_override='darkblue', height=800, width=600)
fig.update_xaxes(matches=None)
fig.show()
Correlation Plots#
We can visualize a correlation matrix as a heatmap using imshow.
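# Correlations are only defined for numeric columns, so drop the text
# column (Compound) first; recent pandas versions raise an error otherwise.
corr = df.select_dtypes(include="number").corr()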
heatmap = px.imshow(corr)
heatmap.show()
heatmap = px.imshow(corr.iloc[:6, :6])
heatmap.show()
Plotly Color Schemes#
The following section demonstrates using the help function to find more information about available color schemes in plotly express.
help(px.colors)
Help on package plotly.express.colors in plotly.express:
NAME
plotly.express.colors - For a list of colors available in `plotly.express.colors`, please see
DESCRIPTION
* the `tutorial on discrete color sequences <https://plotly.com/python/discrete-color/#color-sequences-in-plotly-express>`_
* the `list of built-in continuous color scales <https://plotly.com/python/builtin-colorscales/>`_
* the `tutorial on continuous colors <https://plotly.com/python/colorscales/>`_
Color scales are available within the following namespaces
* cyclical
* diverging
* qualitative
* sequential
PACKAGE CONTENTS
FUNCTIONS
color_parser(colors, function)
Takes color(s) and a function and applies the function on the color(s)
In particular, this function identifies whether the given color object
is an iterable or not and applies the given color-parsing function to
the color or iterable of colors. If given an iterable, it will only be
able to work with it if all items in the iterable are of the same type
- rgb string, hex string or tuple
colorscale_to_colors(colorscale)
Extracts the colors from colorscale as a list
colorscale_to_scale(colorscale)
Extracts the interpolation scale values from colorscale as a list
convert_colors_to_same_type(colors, colortype='rgb', scale=None, return_default_colors=False, num_of_defualt_colors=2)
Converts color(s) to the specified color type
Takes a single color or an iterable of colors, as well as a list of scale
values, and outputs a 2-pair of the list of color(s) converted all to an
rgb or tuple color type, aswell as the scale as the second element. If
colors is a Plotly Scale name, then 'scale' will be forced to the scale
from the respective colorscale and the colors in that colorscale will also
be coverted to the selected colortype. If colors is None, then there is an
option to return portion of the DEFAULT_PLOTLY_COLORS
:param (str|tuple|list) colors: either a plotly scale name, an rgb or hex
color, a color tuple or a list/tuple of colors
:param (list) scale: see docs for validate_scale_values()
:rtype (tuple) (colors_list, scale) if scale is None in the function call,
then scale will remain None in the returned tuple
convert_colorscale_to_rgb(colorscale)
Converts the colors in a colorscale to rgb colors
A colorscale is an array of arrays, each with a numeric value as the
first item and a color as the second. This function specifically is
converting a colorscale with tuple colors (each coordinate between 0
and 1) into a colorscale with the colors transformed into rgb colors
convert_dict_colors_to_same_type(colors_dict, colortype='rgb')
Converts a colors in a dictioanry of colors to the specified color type
:param (dict) colors_dict: a dictioanry whose values are single colors
convert_to_RGB_255(colors)
Multiplies each element of a triplet by 255
Each coordinate of the color tuple is rounded to the nearest float and
then is turned into an integer. If a number is of the form x.5, then
if x is odd, the number rounds up to (x+1). Otherwise, it rounds down
to just x. This is the way rounding works in Python 3 and in current
statistical analysis to avoid rounding bias
:param (list) rgb_components: grabs the three R, G and B values to be
returned as computed in the function
find_intermediate_color(lowcolor, highcolor, intermed, colortype='tuple')
Returns the color at a given distance between two colors
This function takes two color tuples, where each element is between 0
and 1, along with a value 0 < intermed < 1 and returns a color that is
intermed-percent from lowcolor to highcolor. If colortype is set to 'rgb',
the function will automatically convert the rgb type to a tuple, find the
intermediate color and return it as an rgb color.
hex_to_rgb(value)
Calculates rgb values from a hex color code.
:param (string) value: Hex color string
:rtype (tuple) (r_value, g_value, b_value): tuple of rgb values
label_rgb(colors)
Takes tuple (a, b, c) and returns an rgb color 'rgb(a, b, c)'
make_colorscale(colors, scale=None)
Makes a colorscale from a list of colors and a scale
Takes a list of colors and scales and constructs a colorscale based
on the colors in sequential order. If 'scale' is left empty, a linear-
interpolated colorscale will be generated. If 'scale' is a specificed
list, it must be the same legnth as colors and must contain all floats
For documentation regarding to the form of the output, see
https://plot.ly/python/reference/#mesh3d-colorscale
:param (list) colors: a list of single colors
n_colors(lowcolor, highcolor, n_colors, colortype='tuple')
Splits a low and high color into a list of n_colors colors in it
Accepts two color tuples and returns a list of n_colors colors
which form the intermediate colors between lowcolor and highcolor
from linearly interpolating through RGB space. If colortype is 'rgb'
the function will return a list of colors in the same form.
named_colorscales()
Returns lowercased names of built-in continuous colorscales.
unconvert_from_RGB_255(colors)
Return a tuple where each element gets divided by 255
Takes a (list of) color tuple(s) where each element is between 0 and
255. Returns the same tuples where each tuple element is normalized to
a value between 0 and 1
unlabel_rgb(colors)
Takes rgb color(s) 'rgb(a, b, c)' and returns tuple(s) (a, b, c)
This function takes either an 'rgb(a, b, c)' color or a list of
such colors and returns the color tuples in tuple(s) (a, b, c)
validate_colors(colors, colortype='tuple')
Validates color(s) and returns a list of color(s) of a specified type
validate_colors_dict(colors, colortype='tuple')
Validates dictioanry of color(s)
validate_colorscale(colorscale)
Validate the structure, scale values and colors of colorscale.
validate_scale_values(scale)
Validates scale values from a colorscale
:param (list) scale: a strictly increasing list of floats that begins
with 0 and ends with 1. Its usage derives from a colorscale which is
a list of two-lists (a list with two elements) of the form
[value, color] which are used to determine how interpolation weighting
works between the colors in the colorscale. Therefore scale is just
the extraction of these values from the two-lists in order
DATA
DEFAULT_PLOTLY_COLORS = ['rgb(31, 119, 180)', 'rgb(255, 127, 14)', 'rg...
PLOTLY_SCALES = {'Blackbody': [[0, 'rgb(0,0,0)'], [0.2, 'rgb(230,0,0)'...
__all__ = ['named_colorscales', 'cyclical', 'diverging', 'sequential',...
FILE
/home/janash/miniconda3/envs/python-scripting-2/lib/python3.9/site-packages/plotly/express/colors/__init__.py
help(px.colors.diverging)
Help on module _plotly_utils.colors.diverging in _plotly_utils.colors:
NAME
_plotly_utils.colors.diverging
DESCRIPTION
Diverging color scales are appropriate for continuous data that has a natural midpoint other otherwise informative special value, such as 0 altitude, or the boiling point
of a liquid. The color scales in this module are mostly meant to be passed in as the `color_continuous_scale` argument to various functions, and to be used with the `color_continuous_midpoint` argument.
FUNCTIONS
swatches(template=None)
Parameters
----------
template : str or dict or plotly.graph_objects.layout.Template instance
The figure template name or definition.
Returns
-------
fig : graph_objects.Figure containing the displayed image
A `Figure` object. This figure demonstrates the color scales and
sequences in this module, as stacked bar charts.
DATA
__all__ = ['swatches']
FILE
/home/janash/miniconda3/envs/python-scripting-2/lib/python3.9/site-packages/_plotly_utils/colors/diverging.py
px.colors.diverging.swatches()
heatmap = px.imshow(corr.iloc[:6, :6], color_continuous_scale="RdBu", color_continuous_midpoint=0)
heatmap.show()
Saving Images#
heatmap.write_image("correlation.png") # to save png
heatmap.write_html("correlation.html") # to save html