# Python I

In this class, we will watch the first of four lectures by Dr. Mike Gelbart, option co-director of the UBC-Vancouver MDS program.

<div class="youtube">
<iframe class="responsive-iframe" height="350px" width="622px" src="https://www.youtube-nocookie.com/embed/yBAYduexjuA" frameborder="0" allow="accelerometer; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</div>

### Attribution

- The original version of these Python lectures was by [Patrick Walls](https://www.math.ubc.ca/~pwalls/).
- These lectures were delivered by [Mike Gelbart](https://mikegelbart.com) and are [available publicly here](https://www.youtube.com/watch?v=yBAYduexjuA).

## About this course (5 min)

### High-level overview:

- The MDS program has a programming prerequisite.
- Therefore, this course does not start from "no programming knowledge".
  - You should know what an `if` statement is.
  - You should know what a `for` loop is.
  - You should know what a function is.
- However, not all of you have used Python/R.
- So, this course is about _Python-specific_ and _R-specific_ syntax/knowledge.
- We will cover things like loops, but just the syntax, not the concept of a loop.
- Weeks 1&2: Python, lectures by Mike Gelbart
- Weeks 3&4: R, lectures by Tiffany Timbers

## Lecture Outline:

- Basic datatypes (20 min)
- Lists and tuples (20 min)
- Break (5 min)
- String methods (5 min)
- Dictionaries (10 min)
- Conditionals (10 min)

## Basic datatypes (20 min)

- A **value** is a piece of data that a computer program works with such as a number or text. 
- There are different **types** of values: `42` is an integer and `"Hello!"` is a string. 
- A **variable** is a name that refers to a value. 
  - In mathematics and statistics, we usually use variable names like $x$ and $y$. 
  - In Python, we can use any word as a variable name (as long as it starts with a letter or underscore and is not a [reserved word](https://docs.python.org/3.3/reference/lexical_analysis.html#keywords) in Python such as `for`, `while`, `class`, `lambda`, etc.). 
- And we use the **assignment operator** `=` to assign a value to a variable.

See the [Python 3 documentation](https://docs.python.org/3/library/stdtypes.html) for a summary of the standard built-in Python datatypes. See [Think Python (Chapter 2)](http://greenteapress.com/thinkpython/html/thinkpython003.html) for a discussion of variables, expressions and statements in Python.

#### Common built-in Python data types

| English name | Type name | Description | Example |
| :--- | :--- | :--- | :--- |
| integer | `int` | positive/negative whole numbers | `42` |
| floating point number | `float` | real number in decimal form | `3.14159` |
| boolean | `bool` | true or false | `True` |
| string | `str` | text | `"I Can Has Cheezburger?"` |
| list | `list` | a collection of objects - mutable & ordered | `['Ali','Xinyi','Miriam']` |
| tuple | `tuple` | a collection of objects - immutable & ordered | `('Thursday',6,9,2018)` |
| dictionary | `dict` | mapping of key-value pairs | `{'name':'DSCI','code':511,'credits':2}` |
| none | `NoneType` | represents no value | `None` |

#### Numeric Types

In [1]:
x = 42

In [2]:
type(x)

int

In [3]:
print(x)

42


In [4]:
x # in Jupyter/IPython we don't need to explicitly print for the last line of a cell

42

In [5]:
pi = 3.14159

In [6]:
print(pi)

3.14159


In [7]:
type(pi)

float

In [8]:
λ = 2

#### Arithmetic Operators

The syntax for the arithmetic operators is:

| Operator | Description |
| :---: | :---: |
| `+` | addition |
| `-` | subtraction |
| `*` | multiplication |
| `/` | division |
| `**` | exponentiation |
| `//` | integer division |
| `%`  | modulo |

Let's apply these operators to numeric types and observe the results.

In [9]:
1 + 2 + 3 + 4 + 5

15

In [10]:
0.1 + 0.2

0.30000000000000004

```{tip}
From Firas: This is floating point arithmetic. For an explanation of what's going on, [see this tutorial](https://docs.python.org/3/tutorial/floatingpoint.html).
```
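Because of this rounding, comparing floats with `==` can give surprising results. Here is a minimal sketch, assuming a tolerance-based comparison is acceptable for your use case, that uses `math.isclose` from the standard library (the variable names are made up for illustration):

```python
import math

total = 0.1 + 0.2

# Exact comparison fails because 0.1 and 0.2 have no exact binary representation
exact_match = (total == 0.3)

# math.isclose compares within a relative tolerance (default rel_tol=1e-09)
close_match = math.isclose(total, 0.3)

print(exact_match)  # False
print(close_match)  # True
```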

In [11]:
2 * 3.14159

6.28318

In [12]:
2**10

1024

In [13]:
type(2**10)

int

In [14]:
2.0**10

1024.0

In [15]:
int_2 = 2

In [16]:
float_2 = 2.0

In [17]:
float_2_again = 2.

In [18]:
101 / 2

50.5

In [19]:
101 // 2 # "integer division" - always rounds down

50

In [20]:
101 % 2 # "101 mod 2", or the remainder when 101 is divided by 2

1
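"Always rounds down" matters most for negative numbers, where rounding down is not the same as dropping the decimal part. A small sketch (variable names are illustrative only):

```python
# Floor division rounds toward negative infinity, not toward zero
quotient = -7 // 2    # -4, not -3
remainder = -7 % 2    # 1

# The two results always satisfy: a == (a // b) * b + (a % b)
reconstructed = quotient * 2 + remainder  # -7

# divmod returns the quotient and the remainder together as a tuple
pair = divmod(-7, 2)  # (-4, 1)

print(quotient, remainder, reconstructed, pair)
```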

#### None

- `NoneType` is its own type in Python.
- It has only one possible value, `None`.

In [21]:
x = None

In [22]:
print(x)

None


In [23]:
type(x)

NoneType

You may have seen similar things in other languages, like `null` in Java, etc.

#### Strings

- Text is stored as a type called a string. 
- We think of a string as a sequence of characters. 
- We write strings as characters enclosed with either:
  - single quotes, e.g., `'Hello'` 
  - double quotes, e.g., `"Goodbye"`
  - triple single quotes, e.g., `'''Yesterday'''`
  - triple double quotes, e.g., `"""Tomorrow"""`

In [24]:
my_name = "Mike Gelbart"

In [25]:
print(my_name)

Mike Gelbart


In [29]:
type(my_name)

str

In [31]:
course = 'DSCI 511'

In [32]:
print(course)

DSCI 511


In [33]:
type(course)

str

If the string contains a quotation or apostrophe, we can use double quotes or triple quotes to define the string.

In [34]:
sentence = "It's a rainy day."

In [35]:
print(sentence)

It's a rainy day.


In [36]:
type(sentence)

str

In [38]:
saying = '''They say: 
"It's a rainy day!"'''

In [39]:
print(saying)

They say: 
"It's a rainy day!"


#### Boolean

- The Boolean (`bool`) type has two values: `True` and `False`. 

In [40]:
the_truth = True

In [41]:
print(the_truth)

True


In [42]:
type(the_truth)

bool

In [43]:
lies = False

In [44]:
print(lies)

False


In [45]:
type(lies)

bool

#### Comparison Operators

Compare objects using comparison operators. The result is a Boolean value.

| Operator | Description |
| :---: | :--- |
| `x == y ` | is `x` equal to `y`? |
| `x != y` | is `x` not equal to `y`? |
| `x > y` | is `x` greater than `y`? |
| `x >= y` | is `x` greater than or equal to `y`? |
| `x < y` | is `x` less than `y`? |
| `x <= y` | is `x` less than or equal to `y`? |
| `x is y` | is `x` the same object as `y`? |

In [46]:
2 < 3

True

In [47]:
"Data Science" != "Deep Learning"

True

In [48]:
2 == "2"

False

In [49]:
2 == 2.0

True

Note: we will discuss `is` next week.

Operators on Boolean values.

| Operator | Description |
| :---: | :--- |
|`x and y`| are `x` and `y` both true? |
|`x or y` | is at least one of `x` and `y` true? |
| `not x` | is `x` false? | 

In [51]:
True and True

True

In [50]:
True and False

False

In [52]:
False or False

False

In [53]:
("Python 2" != "Python 3") and (2 <= 3)

True

In [54]:
not True

False

In [55]:
not not True

True

#### Casting

- Sometimes (but rarely) we need to explicitly **cast** a value from one type to another.
- Python tries to do something reasonable, or raises an error if there is no sensible conversion.

In [56]:
x = int(5.0)
x

5

In [57]:
type(x)

int

In [58]:
x = str(5.0)
x

'5.0'

In [59]:
type(x)

str

In [60]:
str(5.0) == 5.0

False

In [61]:
list(5.0) # there is no reasonable thing to do here

TypeError: 'float' object is not iterable

In [62]:
int(5.3)

5
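Note that `int` on a float drops the decimal part rather than rounding. The sketch below (names are illustrative) contrasts it with the built-in `round` and shows one conversion that Python refuses:

```python
truncated_pos = int(5.9)    # 5: the fractional part is discarded
truncated_neg = int(-5.9)   # -5: truncation goes toward zero
rounded = round(5.9)        # 6: round to the nearest integer instead
from_text = float("5.3")    # 5.3: strings that look like numbers can be converted

# int() does not accept a string containing a decimal point
try:
    int("5.3")
    error_type = None
except ValueError as err:
    error_type = type(err).__name__

print(truncated_pos, truncated_neg, rounded, from_text, error_type)
```

If you need an integer from a string like `"5.3"`, one option is to convert in two steps: `int(float("5.3"))`.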

## Lists and Tuples (20 min)

- Lists and tuples allow us to store multiple things ("elements") in a single object.
- The elements are _ordered_.

In [63]:
my_list = [1, 2, "THREE", 4, 0.5]

In [65]:
print(my_list)

[1, 2, 'THREE', 4, 0.5]


In [66]:
type(my_list)

list

You can get the length of the list with `len`:

In [67]:
len(my_list)

5

In [68]:
today = (1, 2, "THREE", 4, 0.5)

In [69]:
print(today)

(1, 2, 'THREE', 4, 0.5)


In [70]:
type(today)

tuple

In [71]:
len(today)

5

#### Indexing and Slicing Sequences

- We can access values inside a list, tuple, or string using the bracket syntax. 
- Python uses zero-based indexing, which means the first element of the list is in position 0, not position 1. 
- Sadly, R uses one-based indexing, so get ready to be confused.

In [72]:
my_list

[1, 2, 'THREE', 4, 0.5]

In [73]:
my_list[0]

1

In [74]:
my_list[4]

0.5

In [75]:
my_list[5]

IndexError: list index out of range

In [77]:
today[4]

0.5

We use negative indices to count backwards from the end of the list.

In [80]:
my_list

[1, 2, 'THREE', 4, 0.5]

In [81]:
my_list[-1]

0.5

We use the colon `:` to access a subsequence. This is called "slicing".

In [84]:
my_list[1:4]

[2, 'THREE', 4]

- Above: note that the start is inclusive and the end is exclusive.
- So `my_list[1:4]` fetches elements 1, 2, and 3, but not 4.
- In other words, it gets the 2nd, 3rd, and 4th elements in the list.
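The inclusive-start, exclusive-end rule also means that a slice `[start:end]` contains exactly `end - start` elements, and that splitting a list at one index never duplicates or drops an element. A quick sketch to check this (the variable names `piece`, `rejoined`, and `beyond` are made up here):

```python
my_list = [1, 2, "THREE", 4, 0.5]

start, end = 1, 4
piece = my_list[start:end]

# The slice holds exactly end - start elements
print(len(piece) == end - start)  # True

# Splitting at the same index gives two halves that rebuild the original
rejoined = my_list[:3] + my_list[3:]
print(rejoined == my_list)  # True

# Unlike indexing, slicing past the end does not raise an IndexError
beyond = my_list[3:100]
print(beyond)  # [4, 0.5]
```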

We can omit the start or end:

In [85]:
my_list[:3]

[1, 2, 'THREE']

In [86]:
my_list[3:]

[4, 0.5]

In [87]:
my_list[:] # *almost* same as my_list - more details next week

[1, 2, 'THREE', 4, 0.5]

Strings behave the same as lists and tuples when it comes to indexing and slicing.

In [88]:
alphabet = "abcdefghijklmnopqrstuvwxyz"

In [89]:
alphabet[0]

'a'

In [90]:
alphabet[-1]

'z'

In [91]:
alphabet[-3]

'x'

In [92]:
alphabet[:5]

'abcde'

In [93]:
alphabet[12:20]

'mnopqrst'

#### List Methods

- A list is an object and it has methods for interacting with its data. 
- For example, `list.append(item)` appends an item to the end of the list. 
- See the documentation for more [list methods](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists).

In [94]:
primes = [2,3,5,7,11]
primes

[2, 3, 5, 7, 11]

In [95]:
len(primes)

5

In [114]:
primes.append(13)

In [115]:
primes

[2, 3, 5, 7, 11, 13]

In [116]:
len(primes)

6

In [117]:
max(primes)

13

In [118]:
min(primes)

2

In [119]:
sum(primes)

41

In [120]:
[1,2,3] + ["Hello", 7]

[1, 2, 3, 'Hello', 7]
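One detail worth noticing: `append` changes the existing list and returns `None`, whereas `+` leaves both operands alone and builds a new list. A small sketch with a made-up list `evens`:

```python
evens = [2, 4, 6]

# append modifies the list in place; its return value is None
returned = evens.append(8)
print(returned)  # None
print(evens)     # [2, 4, 6, 8]

# + creates a new list and leaves evens unchanged
combined = evens + [10, 12]
print(combined)  # [2, 4, 6, 8, 10, 12]
print(evens)     # [2, 4, 6, 8]

# extend adds every item of another list, in place
evens.extend([10, 12])
print(evens == combined)  # True
```

A common slip is writing `evens = evens.append(8)`, which replaces the list with `None`.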

#### Sets

- Another built-in Python data type is the `set`, which stores an _unordered_ collection of _unique_ items.
- More on sets in DSCI 512.

In [121]:
s = {2,3,5,11}
s

{2, 3, 5, 11}

In [122]:
{1,2,3} == {3,2,1}

True

In [123]:
[1,2,3] == [3,2,1]

False

In [124]:
s.add(2) # does nothing
s

{2, 3, 5, 11}

In [125]:
s[0]

TypeError: 'set' object is not subscriptable

Above: throws an error because elements are not ordered.
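Two things sets are handy for are removing duplicates and checking membership. A minimal sketch, using a made-up list of names:

```python
names = ["Ali", "Xinyi", "Ali", "Miriam", "Xinyi"]

# Converting to a set keeps one copy of each item
unique_names = set(names)
n_unique = len(unique_names)
print(n_unique)  # 3

# Membership tests use the `in` keyword
has_ali = "Ali" in unique_names
print(has_ali)  # True

# A set has no order, so use sorted() to get an ordered list back
sorted_names = sorted(unique_names)
print(sorted_names)  # ['Ali', 'Miriam', 'Xinyi']
```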

#### Mutable vs. Immutable Types

- Strings and tuples are immutable types which means they cannot be modified. 
- Lists are mutable and we can assign new values to their entries. 
- This is the main difference between lists and tuples.

In [126]:
names_list = ["Indiana","Fang","Linsey"]
names_list

['Indiana', 'Fang', 'Linsey']

In [128]:
names_list[0] = "Cool guy"
names_list

['Cool guy', 'Fang', 'Linsey']

In [129]:
names_tuple = ("Indiana","Fang","Linsey")
names_tuple

('Indiana', 'Fang', 'Linsey')

In [130]:
names_tuple[0] = "Not cool guy"

TypeError: 'tuple' object does not support item assignment

Same goes for strings. Once defined, we cannot modify the characters of the string.

In [131]:
my_name = "Mike"

In [132]:
my_name[-1] = 'q'

TypeError: 'str' object does not support item assignment

In [137]:
x = ([1,2,3],5)

In [138]:
x[1] = 7

TypeError: 'tuple' object does not support item assignment

In [139]:
x

([1, 2, 3], 5)

In [140]:
x[0][1] = 4

In [141]:
x

([1, 4, 3], 5)
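The last example works because a tuple fixes *which* objects it holds, not the contents of those objects. The sketch below (with made-up names `inner` and `pair`) shows that the list inside the tuple can still change, while the tuple's slots cannot be reassigned:

```python
inner = [1, 2, 3]
pair = (inner, 5)

# Mutating the list is allowed; the tuple still holds the same list object
inner.append(4)
print(pair)              # ([1, 2, 3, 4], 5)
print(pair[0] is inner)  # True

# Pointing a slot of the tuple at a different object is not allowed
try:
    pair[0] = [9, 9]
    reassigned = True
except TypeError:
    reassigned = False

print(reassigned)  # False
```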

## Break (5 min)

## String Methods (5 min)

- There are various useful string methods in Python.
- MDS-CL students will soon be the experts we can go to for help!

In [142]:
all_caps = "HOW ARE YOU TODAY?"
print(all_caps)

HOW ARE YOU TODAY?


In [145]:
new_str = all_caps.lower()
new_str

'how are you today?'

Note that the method `lower` doesn't change the original string but rather returns a new one.


In [146]:
all_caps

'HOW ARE YOU TODAY?'

There are *many* string methods. Check out the [documentation](https://docs.python.org/3/library/stdtypes.html#string-methods).

In [147]:
all_caps.split()

['HOW', 'ARE', 'YOU', 'TODAY?']

In [148]:
all_caps.count("O")

3
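A few more string methods that tend to come up often, as a rough sketch (the choice of methods here is just a sample, and each one returns a new value rather than changing `all_caps`):

```python
all_caps = "HOW ARE YOU TODAY?"

# split breaks a string into a list; join glues a list back together
words = all_caps.split()
rejoined = " ".join(words)
print(rejoined == all_caps)  # True

# replace swaps one substring for another
replaced = all_caps.replace("TODAY", "TONIGHT")
print(replaced)  # HOW ARE YOU TONIGHT?

# title capitalizes the first letter of each word
title = all_caps.title()
print(title)  # How Are You Today?

# startswith returns a Boolean
starts = all_caps.startswith("HOW")
print(starts)  # True
```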

One can explicitly cast a string to a list:

In [151]:
caps_list = list(all_caps)
caps_list

['H',
 'O',
 'W',
 ' ',
 'A',
 'R',
 'E',
 ' ',
 'Y',
 'O',
 'U',
 ' ',
 'T',
 'O',
 'D',
 'A',
 'Y',
 '?']

In [152]:
len(all_caps)

18

In [153]:
len(caps_list)

18

#### String formatting

- Python has ways of creating strings by "filling in the blanks" and formatting them nicely. 
- There are a few ways of doing this. See [here](https://realpython.com/python-string-formatting/) and [here](https://stackoverflow.com/questions/5082452/string-formatting-vs-format) for some discussion.

Old formatting style (borrowed from the C programming language):

In [154]:
template = "Hello, my name is %s. I am %.2f years old."

In [155]:
template % ("Newborn Baby", 4/12)

'Hello, my name is Newborn Baby. I am 0.33 years old.'

New formatting style (see [documentation](https://docs.python.org/3/library/stdtypes.html#str.format)):

In [156]:
template_new = "Hello, my name is {}. I am {:.2f} years old."

In [157]:
template_new.format('Newborn Baby', 4/12)

'Hello, my name is Newborn Baby. I am 0.33 years old.'

Newer formatting style (see [here](https://realpython.com/python-f-strings/#f-strings-a-new-and-improved-way-to-format-strings-in-python)) - note the `f` before the start of the string:

In [3]:
name = "Newborn Baby"
age = 4/12
template_new = f'Hello, my name is {name}. I am {age:.2f} years old.'
template_new

'Hello, my name is Newborn Baby. I am 0.33 years old.'
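To confirm that the three styles really are interchangeable, here is a small side-by-side sketch. It assumes Python 3.6 or later (needed for f-strings), and the `price` example at the end is an extra illustration of the format mini-language rather than something from the lecture:

```python
name = "Newborn Baby"
age = 4 / 12

old_style = "Hello, my name is %s. I am %.2f years old." % (name, age)
new_style = "Hello, my name is {}. I am {:.2f} years old.".format(name, age)
f_string = f"Hello, my name is {name}. I am {age:.2f} years old."

# All three produce the same string
print(old_style == new_style == f_string)  # True

# The part after the colon is a format spec; "," adds thousands separators
price = 2499999
price_text = f"Price: ${price:,}"
print(price_text)  # Price: $2,499,999
```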

## Dictionaries (10 min)

A dictionary is a mapping of keys to values, stored as key-value pairs.

In [158]:
house = {'bedrooms': 3, 'bathrooms': 2, 'city': 'Vancouver', 'price': 2499999, 'date_sold': (1,3,2015)}

condo = {'bedrooms' : 2, 
         'bathrooms': 1, 
         'city'     : 'Burnaby', 
         'price'    : 699999, 
         'date_sold': (27,8,2011)
        }

We can access a specific field of a dictionary with square brackets:

In [159]:
house['price']

2499999

In [160]:
condo['city']

'Burnaby'

We can also edit dictionaries (they are mutable):

In [162]:
condo['price'] = 5 # price already in the dict
condo

{'bedrooms': 2,
 'bathrooms': 1,
 'city': 'Burnaby',
 'price': 5,
 'date_sold': (27, 8, 2011)}

In [163]:
condo['flooring'] = "wood"

In [164]:
condo

{'bedrooms': 2,
 'bathrooms': 1,
 'city': 'Burnaby',
 'price': 5,
 'date_sold': (27, 8, 2011),
 'flooring': 'wood'}

We can delete fields entirely (though I rarely use this):

In [165]:
del condo["city"]

In [166]:
condo

{'bedrooms': 2,
 'bathrooms': 1,
 'price': 5,
 'date_sold': (27, 8, 2011),
 'flooring': 'wood'}

In [167]:
condo[5] = 443345

In [168]:
condo

{'bedrooms': 2,
 'bathrooms': 1,
 'price': 5,
 'date_sold': (27, 8, 2011),
 'flooring': 'wood',
 5: 443345}

In [169]:
condo[(1,2,3)] = 777
condo

{'bedrooms': 2,
 'bathrooms': 1,
 'price': 5,
 'date_sold': (27, 8, 2011),
 'flooring': 'wood',
 5: 443345,
 (1, 2, 3): 777}

In [170]:
condo["nothere"]

KeyError: 'nothere'

A sometimes useful trick about default values:

In [171]:
condo["bedrooms"]

2

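gives the same result as the following when the key exists (for a missing key, `get` returns `None` instead of raising a `KeyError`):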

In [172]:
condo.get("bedrooms")

2

With this syntax you can also use default values:

In [173]:
condo.get("bedrooms", "unknown")

2

In [174]:
condo.get("fireplaces", "unknown")

'unknown'
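One place where the default value is handy is counting: `get(key, 0)` lets us add to a count without first checking whether the key exists. A minimal sketch, using a cut-down `condo` and a made-up dictionary `letter_counts`:

```python
condo = {"bedrooms": 2, "bathrooms": 1, "price": 699999}

# Without a default, get returns None for a missing key (no KeyError)
missing = condo.get("fireplaces")
print(missing)  # None

# Counting letters: start each count at 0 the first time a letter is seen
letter_counts = {}
for letter in "banana":
    letter_counts[letter] = letter_counts.get(letter, 0) + 1

print(letter_counts)  # {'b': 1, 'a': 3, 'n': 2}
```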

- A common operation is finding the maximum dictionary key by value.
- There are a few ways to do this, see [this StackOverflow page](https://stackoverflow.com/questions/268272/getting-key-with-maximum-value-in-dictionary).
- One way of doing it:

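In [175]:
word_lengths = {"the": 3, "list": 4, "of": 2, "words": 5}  # example dict mapping each word to its length
max(word_lengths, key=word_lengths.get)

'words'

We saw the `get` method above. Passing `key=word_lengths.get` tells `max` to call this function on each key of the dict and to compare the returned values when deciding which key is the largest.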

#### Empties

In [176]:
lst = list() # empty list
lst

[]

In [177]:
lst = [] # empty list
lst

[]

In [178]:
tup = tuple() # empty tuple
tup

()

In [179]:
tup = () # empty tuple
tup

()

In [180]:
dic = dict() # empty dict
dic

{}

In [181]:
dic = {} # empty dict
dic

{}

In [182]:
st = set() # empty set
st

set()

In [183]:
st = {} # NOT an empty set!
type(st)

dict

In [184]:
st = {1}
type(st)

set

## Conditionals (10 min)

- [Conditional statements](https://docs.python.org/3/tutorial/controlflow.html) allow us to write programs where only certain blocks of code are executed depending on the state of the program. 
- Let's look at some examples and take note of the keywords, syntax and indentation. 
- Check out the [Python documentation](https://docs.python.org/3/tutorial/controlflow.html) and [Think Python (Chapter 5)](http://greenteapress.com/thinkpython/html/thinkpython006.html) for more information about conditional execution.

In [189]:
name = input("What's your name?")

if name.lower() == 'mike':
    print("That's my name too!")
elif name.lower() == 'santa':
    print("That's a funny name.")
else:
    print("Hello {}! That's a cool name.".format(name))

    print('Nice to meet you!')

What's your name?mike
That's my name too!


In [190]:
bool(None)

False
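`bool(None)` is relevant to conditionals because `if` accepts any value, not only `True` or `False`, and treats it the way `bool` would. A short sketch of which built-in values count as false (`shopping_list` and `status` are hypothetical names used only for illustration):

```python
# Values that bool() treats as False: None, zero, and empty containers
falsy = [bool(None), bool(0), bool(0.0), bool(""), bool([]), bool(()), bool({}), bool(set())]

# Non-zero numbers and non-empty containers are treated as True
truthy = [bool(42), bool("text"), bool([0]), bool(" ")]

shopping_list = []  # hypothetical example variable
if shopping_list:   # behaves like: if bool(shopping_list)
    status = "items to buy"
else:
    status = "nothing to buy"

print(falsy)   # eight False values
print(truthy)  # four True values
print(status)  # nothing to buy
```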

The main points to notice:

* Use keywords `if`, `elif` and `else`
* The colon `:` ends each conditional expression
* Indentation (4 spaces by convention) defines code blocks
* In an `if` statement, the first block whose condition evaluates to `True` is executed, and the program then skips the rest of the `if` statement
* `if` statements don't necessarily need `elif` or `else`
* `elif` lets us check several conditions
* `else` lets us evaluate a default block if all other conditions are `False`
* the end of the entire `if` statement is where the indentation returns to the same level as the first `if` keyword
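Because only the first matching block runs, the order of the conditions matters. A small sketch with a made-up `score` variable:

```python
score = 95  # hypothetical value

# Conditions are checked top to bottom; only the first true branch runs
if score >= 50:
    grade_loose = "pass"
elif score >= 90:
    grade_loose = "excellent"  # never reached: score >= 50 already matched

# Putting the most specific condition first gives the intended result
if score >= 90:
    grade_strict = "excellent"
elif score >= 50:
    grade_strict = "pass"
else:
    grade_strict = "fail"

print(grade_loose)   # pass
print(grade_strict)  # excellent
```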

If statements can also be **nested** inside of one another:

In [192]:
name = input("What's your name?")

if name.lower() == 'mike':
    print("That's my name too!")
elif name.lower() == 'santa':
    print("That's a funny name.")
else:
    print("Hello {0}! That's a cool name.".format(name))
    if name.lower().startswith("super"):
        print("Do you have superpowers?")

print('Nice to meet you!')

What's your name?supersam
Hello supersam! That's a cool name.
Do you have superpowers?
Nice to meet you!


#### Inline if/else

In [193]:
words = ["the", "list", "of", "words"]

x = "long list" if len(words) > 10 else "short list"
x

'short list'

In [194]:
if len(words) > 10:
    x = "long list"
else:
    x = "short list"

In [195]:
x

'short list'

#### (optional) short-circuiting

In [None]:
BLAH # not defined

In [None]:
True or BLAH

In [None]:
True and BLAH

In [None]:
False and BLAH
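What these cells are meant to show: Python evaluates `and`/`or` from left to right and stops as soon as the answer is known, so the undefined name `BLAH` only causes an error when it actually has to be looked up. A runnable sketch of the same idea, with the error caught so the whole block can run at once:

```python
# BLAH is deliberately never defined anywhere in this sketch

# `or` stops at the first true operand, so BLAH is never evaluated
first = True or BLAH
print(first)  # True

# `and` stops at the first false operand, so BLAH is never evaluated
second = False and BLAH
print(second)  # False

# Here the left side does not settle the answer, so Python must evaluate BLAH
try:
    True and BLAH
    raised = False
except NameError:
    raised = True

print(raised)  # True
```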