This lesson is still being designed and assembled (Pre-Alpha version)

Python Data Types

Overview

Teaching: 20 min
Exercises: 10 min
Questions
  • How can I store data in Python variables?

Objectives
  • Be able to name and initialize different built-in data types of Python

If you completed the material in the Python Data and Scripting workshop, you learned of a few data types. Those were int (integers), float (floating point numbers), str (strings), and lists.

This lesson will review those data types, and talk about a few additional ones we will be using throughout the week.

Numeric and Text Data Types - integers, floats, and strings

As a review, here is how you would initialize variables of each type. If a number does not have a decimal, it will be an integer. A string is created by surrounding a value with either single (') or double (") quotes, as shown below.

a = 1
print(F'The type of variable `a` is {type(a)}')

# A float is initialized with a decimal
b = 1.
print(F'The type of variable `b` is {type(b)}')

# A string is initialized with quotation marks 
c = '1'
print(F'The type of variable `c` is {type(c)}')
The type of variable `a` is <class 'int'>
The type of variable `b` is <class 'float'>
The type of variable `c` is <class 'str'>

Floats and integers are numeric data types. For example, in the code above, a and b can be added because they are both numeric.

a + b
2.0

However, you can not add a numeric and non-numeric type.

a + c
TypeError: unsupported operand type(s) for +: 'int' and 'str'

Review

How would you change the expression a + c so that the numeric values of a and c could be added?

Answer

This would require changing c to a numeric type. You could cast it as integer or float.

a + int(c)

You could have also used float(c) here. This will change the variable stored in c to a number if possible.

In addition to floats and integers, Python also supports complex numbers. We will not talk much about these, but you can create them by appending a ‘j’ or ‘J’ to a number.

# Create a complex number
complex_number = 1.0J
complex_number ** 2

More about strings

Strings are created using double or single quotes.

my_string = 'This is a test string'

Recall that strings and numeric types cannot be added. For example,

a + c

will give a Type Error.

However, you can use the + operator on two strings. This results in string concatenation.

c = '1'
d = '0'
e = c + d
print(e)
'10'

Check your understanding

Predict the output of each print statement.

print(10 + 20)
print('10' + '20')
print('10' + 20)

Answer

30
1020
TypeError: can only concatenate str (not "int") to str

The first statement added two numeric types, meaning addition was performed. For the second statement, the + operator was performed on two strings, meaning that the strings were concatenated. The third statement results in a TypeError, beause a string and numeric type cannot be added.

Accessing string elements

Strings can be indexed by using square brackets, and giving an element number (similar to lists). For example, to access the first element in the string

print(F'The first letter in the string is {my_string[0]}')
T

You can also check if strings contain substrings using the following syntax.

'substring' in 'string'

For example,

print(my_string)
print('test' in my_string)
'This is a test string'
True

Grouping data together - lists, tuples, and dictionaries

Lists

There are several built-in data structures in Python which can be used to group similar data together. One commonly used data type which was covered extensively in the Python Data and Scripting Lesson is the list.

# A list is initialized with square brackets
my_list = []

# When elements are present in a list, they are separated by commas
my_list = [1, 2, 3, 4]

List has several built-in methods. These methods are accessed by adding a dot (.) and the method name after the list. In Python, we can see all of the methods associated with an object by using the dir command.

dir(my_mixed_list)
['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

Pay attention to the methods at the bottom which do not begin with __. These are methods which you can use on a list variable (the others are special methods we will learn about later).

In particular, you will see append, which we used in the Python Data and Scripting workshop. Recall that append adds another list element at the end of the list.

There are several other methods as well. These methods will act on the list, and may modify the list. For example, the .sort() method will modify the list in place to sort values from lowest to highest.

numeric_list = [1, 2, 100, 20, 5]
print(F'Unsorted list : {numeric_list}')
numeric_list.sort()
print(F'Sorted list : {numeric_list}')
[1, 2, 100, 20, 5]
Sorted list : [1, 2, 5, 20, 100]

Tuples

Tuples are another data type which seem very similar to lists.

# A tuple is initialized with parenthesis, with values separated by commas
my_tuple = (1, 2, 3, 4)
dir(my_tuple)
['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'count',
 'index']

Tuples have much fewer methods associated with them.

However, unlike lists, tuples cannot be changed after being created.

my_tuple[0] = 0
TypeError: 'tuple' object does not support item assignment

Iteration - Tuples and Lists

You can iterate through the items in both tuples and lists using a for loop.

Let’s consider this block of code.

# This is a list
energy_kcal = [-13.4, -2.7, 5.4, 42.1]
# I can determine its length
energy_length = len(energy_kcal)

# print the list length
print('The length of this list is', energy_length)

Exercise

Use a for loop to iterate through the variable energy_kcal. Create a new list, energy_kJ which contains the values in kJ. Hint: 1 kJ = 4.184* (1 kcal)

Answer

energy_kJ = []
for value in energy_kcal:
     energy_kJ.append(value*4.184)

We first create an empty list (energy_kJ), then iterate through the energy_kcal list using a for loop. We append each new calculated value to energy_kJ.

Dictionaries

The last data type we will discuss is dictionaries. Dictionaries are data structures which allow you to store values using key, value pairs.

# An empty dictionary is initialized with curly braces.
my_dictionary = {}
type(my_dictionary)

We can use a dictionary to group data together.

benzene_molecule = {
    'name' : 'benzene',
    'formula' : 'C6H6',
    'molecular_weight' : 78.11,
}

You create data in a dictionary when the dictionary is initialized by first naming a key, then its value separated by a colon (:). We can access data associated with the keys by using the dictionary name and the key of interest in square brackets. For example, to access the formula of benzene_molecule,

benzene_molecule['formula']
'C6H6'

Check your understanding

How would you access the molecular weight of our benzene molecule?

Answer

benzene_molecule['molecular_weight']
78.11

We access the molecular weight by using the “molecular_weight” keyword.

If we want to add another key value pair to an existing dictionary, we do so in a way similar to assigning a variable, except that we put the new keyword in square brackets after the dictionary name.

benzene_molecule['melting_point'] = 5.5
print(benzene_molecule)
{'name': 'benzene', 'formula': 'C6H6', 'molecular_weight': 78.11, 'melting_point': 5.5}

Exercise

Consider the following block of code which defines a list of molecules and their molecular weights.

# Define molecules
molecules = ["methane", "ethane", "benzene", "propane", "toluene", "butane", "ethylene"]
weights = [16.04, 30.07, 78.11, 44.1, 91.14, 58.12, 28.05]

Convert this code to use a dictionary (molecular_weights) where the keys are molecule names, and the values are molecular weights.

Answer

molecular_weight = {}
alkanes = {}

for i in range(len(molecules)):
    molecular_weight[molecules[i]] = weights[i]

print(molecular_weight)
{'methane': 16.04, 'ethane': 30.07, 'benzene': 78.11, 'propane': 44.1, 'toluene': 91.14, 'butane': 58.12, 'ethylene': 28.05}

We can also access all of the keys or values of the dictionary:

benzene_keys = list(benzene_molecule.keys())
benzene_values = list(benzene_molecule.values())

print(F'The dictionary keys are {benzene_keys}')
print(F'The diciontary values are {benzene_values}')

Looping through dictionaries

There are a few ways to iterate through dictionaries.

If we iterate through them in the same way as lists, we access the dictionary keys.

for k in benzene_molecule:
    print(k)
name
formula
molecular_weight
melting_point

We can iterate through key, value pairs by adding .items() to the end of the dictionary.

for k,v in molecule.items():
    print(k,v)

Check your understanding

Using the dictionary from the previous exercise (molecular_weight), create a second dictionary (alkanes) which only has molecular weights for alkanes.

Answer

alkanes = {}

for k,v in molecular_weight.items():
   if 'ane' in k:
        alkanes[k] = v

print(alkanes)
{'methane': 16.04, 'ethane': 30.07, 'propane': 44.1, 'butane': 58.12}

Copying variables

Imagine that you wanted to do some kind of mathematical manipulation on your new dictionary alkanes - maybe you wanted to convert it to kilograms per mol.

You write the following code to do this operation, starting with making a copy of the alkanes dictionary.

alkanes_kilograms = alkanes

for k,v in alkanes_kilograms.items():
    alkanes_kilograms[k] = v / 1000

print(alkanes_kilograms)
{'methane': 0.01604, 'ethane': 0.03007, 'propane': 0.0441, 'butane': 0.05812}

Great! It looks like it behaved exactly the way we wanted. However, our calculation had some unexpected consequences. Let’s check out our original dictionary.

print(alkanes)
{'methane': 0.01604, 'ethane': 0.03007, 'propane': 0.0441, 'butane': 0.05812}

Modifying our ‘copy’ of the dictionary has actually modified the original dictionary. That’s not what we wanted at all!

In Python, when you set variables equal to other mutable variables like lists or dictionaries, the original is modified.

The solution is to use a copy function. On a list or dictionary, we can do this using a .copy function.

First, let’s revert our dictionary back to the original.

for k,v in alkanes_kilograms.items():
    alkanes_kilograms[k] = v * 1000

print(alkanes_kilograms)
print(alkanes)
{'methane': 16.04, 'ethane': 30.07, 'propane': 44.1, 'butane': 58.12}
{'methane': 16.04, 'ethane': 30.07, 'propane': 44.1, 'butane': 58.12}

To make a true copy of our dictionary, use the .copy function.

alkanes_kilograms = alkanes.copy()

for k,v in alkanes_kilograms.items():
    alkanes_kilograms[k] = v * 1000

print(alkanes_kilograms)
print(alkanes)
{'methane': 16040.0, 'ethane': 30070.0, 'propane': 44100.0, 'butane': 58120.0}
{'methane': 16.04, 'ethane': 30.07, 'propane': 44.1, 'butane': 58.12}

Exercise

Try each of the following commands. When do you need to make copies of variables?

# What happens to `b` if you modify `a`?
a = [1, 2, 3]
b = a
# What happens to `c` if you modify `a`?
a = [1, 2, 3]
c = [[1, 2, 3], [4, 5, 6]]
c[0][:] = a

Answer

For the first code block, changing a also changes b.

For the second code block, changing a does not change c.

Key Points

  • Numeric data types include integers (int), floating point numbers (float), and complex numbers.

  • Text is represented as a string (str)

  • You can use lists, tuples, or dicitionaries to group data together

  • Tuples and lists are similar, but tuples cannot be modified after they are created.

  • Python dictonaries use key-value pairs to group data together