{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Python\n", "\n", "```{admonition} Overview\n", ":class: overview\n", "\n", "Questions:\n", "\n", "* What is the Python programming language, and what is it used for?\n", "\n", "* What are the advantages and disadvantages of using Python?\n", "\n", "* What is a Jupyter notebook?\n", "\n", "Objectives:\n", "\n", "* Describe the Python programming language and its uses.\n", "\n", "* Learn the basics of the Jupyter notebook.\n", "\n", "```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What is Python and why use it?\n", "\n", "All of the software you use on a regular basis is created through the use of programming languages. \n", "Programming languages allow us to write instructions to a computer. \n", "There are many different programming languages, each with its own strengths, weaknesses, and uses. \n", "Some popular programming languages you might hear about are JavaScript (used on the web; any website with interactive content likely uses JavaScript), Python (scientific programming and many other applications), C++ (high-performance applications), SQL (databases), and many more.\n", "\n", "Python is a computer programming language that has become ubiquitous in scientific programming. \n", "The Python programming language was first introduced in 1991 and has grown to be one of the most popular programming languages for both scientists and non-scientists. \n", "According to the [2022 Stack Overflow Developer Survey](https://survey.stackoverflow.co/2022/#most-popular-technologies-language-prof), Python is the fourth most popular programming language.\n", "Compared to other programming languages, Python is considered more intuitive to start learning and is also extremely versatile. \n", "Notably, in contrast to some other languages commonly used in scientific programming, Python is **free and open-source**. 
\n", "This means anyone can download, install, and use Python.\n", "Python can be used to build web applications, interact with databases, and analyze data.\n", "\n", "\n", "## Python in Science\n", "\n", "Python is used in many different scientific fields, including chemistry, physics, biology, and astronomy.\n", "Python is used in scientific programming for a variety of reasons, including:\n", "\n", "* Python is free and open-source.\n", "* Python is easy to learn (compared to compiled programming languages like C++).\n", "* Python has a large community of users and developers.\n", "\n", "The Scientific Python ecosystem is a set of packages commonly used for scientific applications. \n", "These packages are the foundation of scientific Python programming across a range of disciplines and include\n", "NumPy, SciPy, Matplotlib, and pandas.\n", "In our workshop today, we will explore the basics of these foundational libraries.\n", "\n", "In chemistry specifically, there are a number of specialized libraries used for processing or analyzing chemical data. \n", "For example, there are many libraries for reading and writing chemical file formats, such as the [Open Babel](http://openbabel.org/wiki/Main_Page) library. There are also libraries for cheminformatics, like RDKit, and libraries for quantum chemistry, like Psi4.\n", "For experimental chemists, there are libraries for [working with and analyzing NMR spectra](https://github.com/jjhelmus/nmrglue).\n", "\n", "## Getting Started\n", "\n", "Our initial lessons will run Python interactively through a Python interpreter. \n", "We will use an environment called a Jupyter notebook. \n", "The Jupyter notebook is an environment in your browser that can be used to write and execute Python code interactively.\n", "You can view a Jupyter notebook using your browser or in some specialized text editors.\n", "\n", "Jupyter notebooks are made up of cells. 
Cells can be either Markdown (text) cells or code cells.\n", "This cell is a Markdown cell.\n", "Code cells have executable Python code in them.\n", "To change the type of a cell, you can use the drop-down menu in the toolbar at the top of the window.\n", "To run code in a cell, click inside of the cell and press `Shift+Enter`.\n", "If the code executes successfully, the next cell will become the active cell.\n", "\n", "Markdown is a text markup language that Jupyter renders as nicely formatted text.\n", "In Markdown, for example, a first-level heading is denoted by\n", "\n", "```\n", "# Heading\n", "```\n", "\n", "Double click inside this cell to see what the Markdown source looks like!\n", "Try adding some subheadings (`##`) or bullet points yourself!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Our First Python Code\n", "\n", "Any Python interpreter can work just like a calculator. \n", "This is not very useful by itself, but it is a simple way to see how code cells work. \n", "Press the Shift key and the Enter key (`Shift + Enter`) at the same time to run (also called \"execute\") the code in a cell.\n", "The following cell contains an expression to calculate `3 + 7`." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "3 + 7" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Assigning variables\n", "\n", "Python can perform calculations for us. \n", "To save the result of a calculation, or any other value, for later use, we assign it to a variable. \n", "Variable **assignment** is the technical term for doing this. 
\n", "If we do not assign the result of an expression to a variable, we will not be able to use its value later.\n", "\n", "The syntax for assigning variables is the following:\n", "\n", "```python\n", "variable_name = variable_value\n", "```\n", "\n", "Let’s see this in action by defining some variables for a calculation.\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "tags": [] }, "outputs": [], "source": [ "deltaH = -541.5 # kJ/mole\n", "deltaS = 10.4 # kJ/(mole K)\n", "temp = 298 # Kelvin" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice several things about this code. \n", "The text after each `#` is a comment. \n", "The computer does not do anything with these comments. \n", "They have been used here to record the units of each value. \n", "Comments are also often used to explain what the code is doing or to leave information for future people who might use the code.\n", "\n", "When choosing variable names, you should choose informative names so that someone reading your code can tell what they represent. Naming a variable `temp` or `temperature` is much more informative than naming that variable `t`.\n", "\n", "We can now access any of the variables from other cells. 
Let’s calculate something using our defined variables.\n", "\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-3640.7000000000003\n" ] } ], "source": [ "deltaG = deltaH - temp * deltaS\n", "print(deltaG)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## JupyterLab: The `tab` key\n", "\n", "Jupyter notebooks allow autocomplete using the tab key.\n", "To test this out, try typing `pri` in the cell below, then press `tab` on your keyboard twice.\n", "You should see that the word `print` is completed for you.\n", "\n", "After you have the word `print`, add an opening parenthesis, then start typing `delt` and press `tab` twice again.\n", "You will see a list of potential variables or functions you might want to use. \n", "You can use the arrow keys and the Enter key to select the variable you would like to use." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using Functions\n", "\n", "When we use `print`, we are using a function. \n", "Functions are reusable pieces of code that perform certain tasks. \n", "Examples include printing, opening files, performing calculations, and many others.\n", "Functions have a name that is followed by parentheses containing the function inputs (also called *arguments*), separated by commas.\n", "\n", "```python\n", "function_name(argument1, argument2)\n", "```\n", "\n", "In the previous code block, we introduced the `print` function. Often, we will use the `print` function just to make sure our code is working correctly.\n", "\n", "## Overwriting Variables\n", "\n", "Note that performing an operation on a variable does not change its value unless you assign the result back to a variable. 
For example, consider the following cell:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "print(deltaG)\n", "deltaG * 1000\n", "print(deltaG)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Nothing happened to the value of `deltaG`. If we wanted to change the value of `deltaG`, we would have to assign the result to the same name, overwriting the existing value." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "print(deltaG)\n", "deltaG = deltaG * 1000\n", "print(deltaG)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are situations where it is reasonable to overwrite a variable with a new value, but you should always think carefully about this. Usually it is better practice to give the new value its own name and leave the existing variable as is." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "print(deltaG)\n", "deltaG_joules = deltaG * 1000\n", "print(deltaG)\n", "print(deltaG_joules)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Types\n", "\n", "Each variable is some particular type of data. \n", "The most common types of data are strings (`str`), \n", "integers (`int`), and floating-point numbers (`float`).\n", "You can identify the data type of any variable with the function `type(variable_name)`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "type(deltaG)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can convert a value to a different data type with functions such as `str`, as the next cell does. This is called casting."
 ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "deltaG_string = str(deltaG)\n", "type(deltaG_string)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We could have created a variable as a string originally by surrounding the value in quotes `\"\"`. It doesn't matter if you use single or double quotes; the opening quote just has to match the closing quote." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "string_variable = \"This is a string\"\n", "print(type(string_variable))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Lists\n", "\n", "Another common data structure in Python is the list. Lists can be used to group several values or variables together.\n", "For today's workshop, we will be using a more complicated data structure that is part of the pandas library.\n", "However, that data structure shares many attributes with lists.\n", "Lists are a built-in data type in Python, meaning they are always available no matter what libraries you are using.\n", "\n", "You can visualize a list using the illustration below.\n", "In our picture, our list has 6 elements. \n", "**Notably for Python, when counting elements in a list, you start at 0.**\n", "\n", "![list_index](./images/list_index.png)\n", "\n", "Lists are created by placing square brackets around one or more values or variables.\n", "List elements are separated by commas. \n", "Python has several built-in functions which can be used on lists. \n", "The built-in function `len` can be used to determine the length of a list."
 ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The length of this list is 4\n", "The maximum of the energy values is 42.1\n" ] } ], "source": [ "# This is a list\n", "energy_kcal = [-13.4, -2.7, 5.4, 42.1]\n", "\n", "# I can determine its length\n", "energy_length = len(energy_kcal)\n", "\n", "# I can determine its max\n", "max_energy = max(energy_kcal)\n", "\n", "# print calculated values\n", "print('The length of this list is', energy_length)\n", "print('The maximum of the energy values is', max_energy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To operate on a particular element of a list, write the list name followed by the index of the element you want in square brackets. In Python, counting starts at zero, so the first element of `energy_kcal` is `energy_kcal[0]`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Print the first element of the list\n", "print(energy_kcal[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can use an element of a list as a variable in a calculation." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "# Convert the second list element to kilojoules.\n", "energy_kilojoules = energy_kcal[1] * 4.184\n", "print(energy_kilojoules)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Slices\n", "\n", "Sometimes you will want to make a new list that is a subset of an existing list. For example, we might want to make a new list that is just the first few elements of our previous list. This is called a slice. The general syntax is\n", "\n", "```python\n", "new_list = list_name[start:end]\n", "```\n", "\n", "When taking a slice, it is very important to remember how counting works in Python. Remember that counting starts at zero, so the first element of a list is `list_name[0]`. 
When you specify the last element for the slice, it goes *up to but not including* that element of the list. So a slice like\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "short_list = energy_kcal[0:2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "includes energy_kcal[0] and energy_kcal[1] but not energy_kcal[2].\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "print(short_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you do not include a start index, the slice automatically starts at list_name[0]. If you do not include an end index, the slice automatically goes to the end of the list.\n", "\n", "\n", "
Check Your Understanding
\n", "\n", "What does the following code block print?\n", "```python\n", "slice1 = energy_kcal[1:]\n", "slice2 = energy_kcal[:3]\n", "print('slice1 is', slice1)\n", "print('slice2 is', slice2)\n", "```\n", "\n", "See if you can predict the output, then check yourself in your notebook.\n", "