Preparing to Plot
In this workshop, we will use matplotlib to visualize hydrogen atom orbitals. If you’ve taken an introductory quantum chemistry class, you will have seen visualizations of these before, most likely in your textbook.
You can see some visualizations here. Even if you haven’t yet taken quantum chemistry, the shapes of the s and p orbitals will probably be familiar to you from your introductory chemistry classes.
We will be working with pre-calculated data that is in text files. As part of the set-up you should have downloaded these files, and we will explain what is in them as we visualize them. For the purpose of this workshop, it’s not important that you have a deep understanding of the data, or that you understand how it was calculated.
Reading Data using Pandas
In order to plot and create visualizations with our data, we first have to get it into a form that Python can recognize and work with. We will be using the Python library pandas to read in our text files. Pandas is a library that is very widely used in data science.
Using pandas, you can read data into Python and work with your data. The data we want to work with are in .csv files, or comma separated value files. The first file we will work with is called s_orbitals_1D.csv. This text file contains the values of the 1s, 2s and 3s orbitals in the xy plane for different values of x. If you examine this in a text editor, you will see that it has a header row followed by data rows, and that the values are separated by commas.
Pandas can easily read this type of file using a function called read_csv. You can also open files like this in Excel, or even save csv files from Excel.
First, we will need to import pandas. The pandas library name is usually shortened to pd when it is imported. We will use the function pd.read_csv to read the csv file. We give the read_csv function the path to the file we want to read.
import pandas as pd
s_orbitals = pd.read_csv("s_orbitals_1D.csv")
Our data is now in the variable called s_orbitals. This variable is something called a pandas dataframe. It resembles a spreadsheet: it has rows and columns. We can see a preview of what is in the variable by using s_orbitals.head(). It will show us the first five rows.
You can see from the above preview that we have something that resembles a spreadsheet. Our first column contains x values, while the following columns contain values for the 1s, 2s, and 3s orbitals. We are going to plot these eventually, but first we have to understand a little more about how we can use the data we have read in.
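The read-and-preview steps described in this section can be sketched as follows. So that the snippet runs without the workshop file, it reads a small CSV from an in-memory string instead of s_orbitals_1D.csv. The column names (r, 1s, 2s, 3s) and every number in it are placeholder assumptions for illustration, not the contents of the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for s_orbitals_1D.csv: a header row, then
# comma-separated values. Names and numbers are placeholders only.
csv_text = """r,1s,2s,3s
0.0,1.00,0.50,0.30
0.5,0.60,0.35,0.22
1.0,0.37,0.21,0.15
1.5,0.22,0.10,0.09
2.0,0.14,0.00,0.04
2.5,0.08,-0.05,0.01
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

# head() returns the first five rows by default.
preview = sample.head()
print(preview)
```

With the real file, you would pass the path "s_orbitals_1D.csv" to pd.read_csv in place of the in-memory string.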
A Brief Introduction to Dataframes
As stated previously, pandas stores data in rows and columns. You will see above that the rows are numbered and the columns have names. There are a few ways we can access information in a dataframe.
To get a column, we use the syntax dataframe["column_name"]. For example, to get the r column of the dataframe we have read in, we would put the column name ("r") in the square brackets: s_orbitals["r"]. It is very important that this column name be in quotes and that capitalization match what is in the dataframe.
0      0.000000
1      0.517241
2      1.034483
3      1.551724
4      2.068966
5      2.586207
6      3.103448
7      3.620690
8      4.137931
9      4.655172
10     5.172414
11     5.689655
12     6.206897
13     6.724138
14     7.241379
15     7.758621
16     8.275862
17     8.793103
18     9.310345
19     9.827586
20    10.344828
21    10.862069
22    11.379310
23    11.896552
24    12.413793
25    12.931034
26    13.448276
27    13.965517
28    14.482759
29    15.000000
Name: r, dtype: float64
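As a small, self-contained sketch of column selection, the snippet below builds a stand-in dataframe by hand rather than reading the workshop file. The column names and values are hypothetical. It also shows what happens when the capitalization of the column name does not match.

```python
import pandas as pd

# Hypothetical dataframe; names and values are placeholders only.
df = pd.DataFrame({"r": [0.0, 0.5, 1.0], "1s": [1.00, 0.60, 0.37]})

# Selecting one column by name returns a pandas Series.
r_values = df["r"]
print(r_values)

# Column names are case-sensitive: "R" is not the same as "r".
try:
    df["R"]
except KeyError:
    print('No column named "R" in this dataframe')
```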
We can also “slice” the dataframe to get only a portion of it. We would do this if we wanted only some of the rows, for example. To select by row and column number, we use the iloc indexer, with the syntax dataframe.iloc[row_number, column_number].
We use the same slicing syntax that we use for Python lists and numpy arrays. For example, to get the first 10 rows of the first two columns, we would use s_orbitals.iloc[:10, :2].
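Here is a runnable sketch of positional slicing on a stand-in dataframe. The column names and values are made up for illustration, and the dataframe has 12 rows and 3 columns so that the slice is visibly smaller than the original.

```python
import pandas as pd

# Hypothetical dataframe with 12 rows and 3 columns; values are placeholders.
df = pd.DataFrame({
    "r": [i * 0.5 for i in range(12)],
    "1s": [1.0 / (i + 1) for i in range(12)],
    "2s": [0.5 / (i + 1) for i in range(12)],
})

# Rows 0 through 9 and columns 0 and 1. As with Python lists,
# the end of a slice is not included.
subset = df.iloc[:10, :2]
print(subset.shape)
print(list(subset.columns))
```

Because iloc works on positions rather than names, the same slice works no matter what the columns are called.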
Pandas is a very useful library for manipulating data in Python. We won’t be using it extensively for this workshop, but we will need to use it a little to select the data that we would like to plot. If you would like to learn more about the capabilities of pandas, see this lesson from The Molecular Sciences Software Institute.