Introduction

Overview

Questions

  • What is the basic syntax of the Python programming language?

Objectives:

  • Assign values to variables.
  • Use the print function to check how the code is working.
  • Use multiple assignment to assign several variables at once.
  • Use a for loop to perform the same action on the items in a list.

Getting Started

Python is a computer programming language that has become ubiquitous in scientific programming. Our lessons will run Python interactively through a Python interpreter inside a Jupyter notebook. The setup page should have provided information on how to install and start a Jupyter notebook. Everything included in a code block is something you could type into your Python interpreter and evaluate.

Setting up your Jupyter notebooks

In the setup, you learned how to start a Jupyter notebook. Now, we will use the notebook to execute Python code. Jupyter notebooks are divided into cells. You run a Jupyter notebook one cell at a time. To execute a cell, click inside the cell and press shift+enter.

In the upper left corner, click where it says “Untitled” and change the name to “MolSSI Workshop”. We have now changed the name of the Jupyter Notebook.

Jupyter notebooks allow us to also use something called markdown in some cells. We can use markdown to write descriptions about our notebooks for others to read. It’s a good practice to have your first cell be markdown to explain the purpose of the notebook. Let’s do that in our first cell. Click inside the first cell, then on the top of the screen select Cell->Cell Type->Markdown (shown below).

Now, return to the cell and type the following:

# MolSSI Workshop
## Introduction

This lesson covers Python basics like variable creation
and assignment and using the Jupyter notebook

In Markdown, we create headers using a single # sign. Using two (##) creates a subheader. After typing this into a cell, press shift+enter to evaluate. Now your notebook should look like the following.

Now that our notebook is set up, we’re ready to start learning some Python!

Assigning variables and data types

Any Python interpreter can work just like a calculator, although on its own this is not very useful. Type the following into the next cell of your Jupyter notebook.

3+7
10

Here, Python has performed a calculation for us. To save this value, or other values, we assign them to a variable for later use. The syntax for assigning variables is the following:

variable_name = variable_value

Let’s see this in action with a calculation. Type the following into the next cell of your Jupyter notebook.

# Calculations using the Michaelis-Menten Equation
Km = 15.0                # Km = 15 micromolar
Vmax = 100.0             # Vmax = 100.0 nanomoles/sec
substrate_concentration = 8.0                  # Substrate concentration is 8.0 micromolar
velocity = Vmax * substrate_concentration/(Km + substrate_concentration)    # Michaelis-Menten equation

Notice several things about this code. You can use # to add comments to your code, both at the start of a line and in the middle of the line (then the rest of the line is a comment). The computer does not do anything with these comments. They have been used here to remind the user what units are used for each of their values. Comments are also often used to explain what the code is doing or leave information for future people who might use the code.

When choosing variable names, you should choose informative names so that someone reading your code can tell what they represent. Naming a variable temp or temperature is much more informative than naming that variable t.

We can now access any of the variables from other cells. Let’s print the value that we calculated. In the next cell, type:

print(velocity)
34.78260869565217

In the previous code block, we introduced the print() function. Often, we will use the print function just to make sure our code is working correctly.

Note that using a variable in a calculation does not change the value of the variable. Numbers in Python are immutable, so an expression such as velocity*1000 produces a new value and leaves velocity untouched unless you assign the result to a name. For example, if we typed

print(velocity)
velocity*1000
print(velocity)
34.78260869565217
34.78260869565217

Nothing happened to the value of velocity. If we wanted to change the value of velocity we would have to re-save the variable using the same name to overwrite the existing value.

print(velocity)
velocity = velocity * 60
print(velocity)
34.78260869565217
2086.9565217391305

There are situations where it is reasonable to overwrite a variable with a new value, but you should always think carefully about this. Usually it is a better practice to give the variable a new name and leave the existing variable as is.

# change velocity back
velocity = velocity / 60
v_nmols_per_min = velocity * 60
print(velocity)
print(v_nmols_per_min)
34.78260869565217
2086.9565217391305

Assigning multiple variables at once

Python can do what is called multiple assignment, where you assign several variables their values on one line of code. The following code block does the same thing as our original Michaelis-Menten calculation.

# I can assign all these variables at once
Km, Vmax, substrate_concentration = 15.0, 100.0, 8.0
velocity = Vmax * substrate_concentration/(Km + substrate_concentration)
print(velocity)
34.78260869565217
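Multiple assignment matches names to values by position, so the number of names on the left must equal the number of values on the right. As a small sketch (the variable names low_conc and high_conc are made up for this illustration), the same syntax can also swap two variables without a temporary one:

```python
# Names on the left are matched to values on the right by position
low_conc, high_conc = 75.0, 2.0   # these two values are in the wrong order

# Swap the two values in a single line
low_conc, high_conc = high_conc, low_conc

print(low_conc, high_conc)        # 2.0 75.0
```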

Data types

Each variable is some particular type of data. The most common types of data are strings (str), integers (int), and floating point numbers (float). You can identify the data type of any variable with the function type(variable_name).

type(velocity)
float

You can convert a value to a different data type by calling the name of the new type as a function. This is called casting.

velocity_string = str(velocity)
type(velocity_string)
str
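Casting also works in the other direction. The sketch below, which uses a made-up variable named km_text, converts a number stored as a string into a float and then into an integer. Note that int() discards the decimal part rather than rounding.

```python
km_text = '15.7'            # a number stored as a string (hypothetical value)
km_float = float(km_text)   # str -> float
km_int = int(km_float)      # float -> int, the decimal part is discarded

print(type(km_text), type(km_float), type(km_int))
print(km_float, km_int)     # 15.7 15
```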

Lists

Another common data structure in Python is the list. Lists can be used to group several values or variables together, and are declared using square brackets []. Python assigns special meanings to square brackets [], parentheses () and curly brackets {}, so you must be very careful with these characters. List values are separated by commas. Python has several built-in functions which can be used on lists. The built-in function len can be used to determine the length of a list. This code block also demonstrates how to print multiple variables.

# This is a list
substrate_concs = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 15.0, 20.0, 30.0, 50.0, 75.0, 100.0] #micromolar
# I can determine its length
s_length = len(substrate_concs)
# Print the length of the list
print('This list contains', s_length, 'substrate concentrations')
This list contains 13 substrate concentrations

If you want to operate on a particular element of the list, you use the list name and then put in brackets which element of the list you want. In Python, counting starts at zero, so the first element of a list is list_name[0].

# Print the first element of the list
print(substrate_concs[0])
1.0

You can use an element of a list as a variable in a calculation.

# Convert the last substrate concentration to nM
concentration_nM = substrate_concs[12] * 1000
print(concentration_nM)
100000.0
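Writing the index 12 by hand only works if you already know how long the list is. A minimal sketch of two alternatives, using a shortened, made-up list named concs: compute the last index from len, or use Python's negative indexing, where -1 refers to the last element.

```python
concs = [1.0, 2.0, 3.0, 4.0]      # shortened example list (hypothetical values)

last_index = len(concs) - 1       # counting starts at zero, so the last index is length - 1
print(concs[last_index])          # 4.0
print(concs[-1])                  # 4.0, negative indices count from the end
```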

Slices

Sometimes you will want to make a new list that is a subset of an existing list. For example, we might want to make a new list that is just the first few elements of our previous list. This is called a slice. The general syntax is

new_list = list_name[start:end]

When taking a slice, it is very important to remember how counting works in Python. Remember that counting starts at zero, so the first element of a list is list_name[0]. When you specify the last element for the slice, it goes up to but not including that element of the list. So a slice like

short_list = substrate_concs[0:3]

includes substrate_concs[0], substrate_concs[1] and substrate_concs[2] but not substrate_concs[3].

print(short_list)
[1.0, 2.0, 3.0]

If you do not include a start index, the slice automatically starts at list_name[0]. If you do not include an end index, the slice automatically goes to the end of the list.
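To see these defaults on a different, made-up list (the name letters is only for illustration):

```python
letters = ['a', 'b', 'c', 'd', 'e']

first_two = letters[:2]    # no start index: begins at letters[0]
from_third = letters[2:]   # no end index: runs to the end of the list
full_copy = letters[:]     # neither index: a copy of the whole list

print(first_two)           # ['a', 'b']
print(from_third)          # ['c', 'd', 'e']
print(full_copy)           # ['a', 'b', 'c', 'd', 'e']
```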

Check your understanding

What does the following code print?

slice1 = substrate_concs[8:]
slice2 = substrate_concs[:3]
print('slice1 is', slice1)
print('slice2 is', slice2)

If you don’t assign the slice to a new variable name, the result is not saved. Looking at our example above, if we only slice the list, nothing happens to substrate_concs.

print(substrate_concs)
substrate_concs[0:3]
print(substrate_concs)
[1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 15.0, 20.0, 30.0, 50.0, 75.0, 100.0]
[1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 15.0, 20.0, 30.0, 50.0, 75.0, 100.0]

Repeating an operation many times: for loops

Often, you will want to do something to every element of a list. The structure to do this is called a for loop. The general structure of a for loop is

for variable in list:
    do things using variable

Indentation is very important in Python. There is nothing like an end or exit statement that tells you that you are finished with the loop. The indentation shows you what statements are in the loop. Let’s use a loop to calculate initial velocities for all the substrate concentrations in our substrate_concs list.

substrate_concs = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 15.0, 20.0, 30.0, 50.0, 75.0, 100.0]
for number in substrate_concs:
    velocity = Vmax * number / (Km + number)
    print(velocity)
6.25
11.764705882352942
16.666666666666668
21.05263157894737
28.571428571428573
34.78260869565217
40.0
50.0
57.142857142857146
66.66666666666667
76.92307692307692
83.33333333333333
86.95652173913044
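To make the role of indentation concrete, here is a small sketch with made-up values. The indented lines run once per item; the final print is not indented, so it runs only once, after the loop has finished.

```python
volumes_mL = [0.5, 1.0, 2.5]   # hypothetical values for illustration
total = 0.0

for volume in volumes_mL:
    total = total + volume     # indented: runs for every item in the list
    print('added', volume)     # indented: also part of the loop

print('total volume is', total)   # not indented: runs once, after the loop
```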

Now it seems like we are really getting somewhere with our program! But it would be even better if instead of just printing the values, it saved them in a new list. To do this, we are going to use the append function. The append function adds a new item to the end of an existing list. The general form of the append function is

list_name.append(new_thing)

Try running this block of code. See if you can figure out why it doesn’t work.

substrate_concs = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 15.0, 20.0, 30.0, 50.0, 75.0, 100.0]
for number in substrate_concs:
    V = Vmax * number / (Km + number)
    velocities.append(V)
    
print(velocities)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
/tmp/ipykernel_1865/252774083.py in <module>
      2 for number in substrate_concs:
      3     V = Vmax * number / (Km + number)
----> 4     velocities.append(V)
      5 
      6 print(velocities)

NameError: name 'velocities' is not defined

This code doesn’t work because on the first iteration of our loop, the list velocities doesn’t exist. To make it work, we have to start the list outside of the loop. The list can be blank when we start it, but we have to start it.

velocities = []
substrate_concs = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 15.0, 20.0, 30.0, 50.0, 75.0, 100.0]
for number in substrate_concs:
    V = Vmax * number / (Km + number)
    velocities.append(V)
    
print(velocities)
[6.25, 11.764705882352942, 16.666666666666668, 21.05263157894737, 28.571428571428573, 34.78260869565217, 40.0, 50.0, 57.142857142857146, 66.66666666666667, 76.92307692307692, 83.33333333333333, 86.95652173913044]

Making choices: Logic Statements

Within your code, you may need to evaluate a variable and then do something if the variable has a particular value. This type of logic is handled by an if statement. In the following example, we are only going to use the lower substrate concentrations, as we might do when we are looking for the initial linear portion of a Michaelis-Menten curve. As a biochemist, you may want to plot the data to see what you find. We will address plotting later in the workshop.

Km, Vmax = 15.0, 100.0
linear_MM = []

substrate_concs = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 15.0, 20.0, 30.0, 50.0, 75.0, 100.0]
for number in substrate_concs:
    if number < Km/4:
        V_linear = Vmax * number / (Km + number)
        linear_MM.append(V_linear)

print(linear_MM)
[6.25, 11.764705882352942, 16.666666666666668]

Other logic operations include

  • equal to ==

  • not equal to !=

  • greater than >

  • less than <

  • greater than or equal to >=

  • less than or equal to <=
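
Each of these comparisons evaluates to either True or False, which is what an if statement checks. A quick sketch, using a made-up value for conc:

```python
Km = 15.0
conc = 8.0    # hypothetical substrate concentration

print(conc == Km)   # False
print(conc != Km)   # True
print(conc > Km)    # False
print(conc < Km)    # True
print(conc >= Km)   # False
print(conc <= Km)   # True
```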

You can also use and, or, and not to check more than one condition.

substrate_concs = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0, 15.0, 20.0, 30.0, 50.0, 75.0, 100.0]
v_at_or_below_km = []
for number in substrate_concs:
    if number < Km or number == Km:
        velocity = Vmax * number / (Km + number)
        v_at_or_below_km.append(velocity)

print(v_at_or_below_km)
[6.25, 11.764705882352942, 16.666666666666668, 21.05263157894737, 28.571428571428573, 34.78260869565217, 40.0, 50.0]

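Strings can be compared with == and != just like numbers. Python also provides in and not in, which test whether one string is contained in another (or whether an item is in a list), and is and is not, which test whether two names refer to the same object rather than whether their values are equal.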
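
A minimal sketch of how these operators behave, with made-up variable names. It shows == comparing values, in testing membership, and is testing whether two names point to the same object. The is example uses lists because the result for two equal strings can depend on how Python stores them.

```python
enzyme = 'hexokinase'               # hypothetical example string

print(enzyme == 'hexokinase')       # True: the values are equal
print('kinase' in enzyme)           # True: 'kinase' is a substring
print('lipase' not in enzyme)       # True: 'lipase' does not appear

list_a = [1.0, 2.0]
list_b = list_a                     # a second name for the same list
list_c = [1.0, 2.0]                 # a different list with equal contents

print(list_a is list_b)             # True: same object
print(list_a == list_c)             # True: equal values
print(list_a is list_c)             # False: two separate objects
```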

Exercise

The following list contains some floating point numbers and some numbers which have been saved as strings. Copy this list exactly into your code.

conc_list = ['1.0', 2.0, 5.0, '14.0', 20.0]

Set up a for loop to go over each element of conc_list. If the element is a string (str), recast it as a float. Save all of the numbers to a new list called number_list. Pay close attention to your indentation!

A note about Jupyter notebooks

If you use the Jupyter notebook as your Python interpreter, the notebook only executes the current code block. This can have several unintended consequences. If you change a value and then go back and run an earlier code block, it will use the new value, not the first defined value, which may give you an incorrect analysis. Similarly, if you open your Jupyter notebook later and try to run a code block in the middle, it may tell you that your variables are undefined, even though you can clearly see them defined in earlier code blocks. If you didn’t re-run those code blocks, Python doesn’t know the variables exist.