from IPython.display import IFrame
from IPython.display import Markdown
# Additional styling ; should be moved into helpers
from IPython.display import display, HTML
HTML('<style>{}</style>'.format(open('rise.css').read()))

Class 7B: Programming in Python II¶

We will begin soon!! Until then, feel free to use the chat to socialize, and enjoy the music!

October 20, 2021
Firas Moosvi

Announcements¶

  1. Welcome Back!

  1. Grades and feedback for Labs 1-3 and feedback for LL1 - LL4 are now released (grades to come much later); if there are any major issues, submit a regrade request on Gradescope.

  1. Milestone 3 due next week!

  1. Test 2 window will start this Friday at 6 PM!

  1. Lab 5 is due this week.

  1. Reminder: my Student Hours are after class on M,W,F and the Project TA Student hours are Wednesdays from 1-2 PM

Python II¶

In this class, we go through a notebook by a former colleague, Dr. Mike Gelbart, option co-director of the UBC-Vancouver MDS program.

If you prefer, you can also watch his recording of the same material.

Class Outline¶

  • Functions Intro

  • Docstrings

  • Unit tests, corner cases

  • Multiple return values

Attribution¶

Functions intro¶

  • Define a function to re-use a block of code with different input parameters, also known as arguments.

  • For example, define a function called square which takes one input parameter n and returns the square n**2.

def square(n):
    n_squared = n**2
    return n_squared
var = 5

square(n=var)
25
square(100)
10000
square(12345)
152399025
  • Begins with def keyword, function name, input parameters and then colon (:)

  • Function block defined by indentation

  • Output or “return” value of the function is given by the return keyword

Null return type¶

If you do not specify a return value, the function returns None when it terminates:

def f(x):
    x + 1 # no return!
    if x == 999:
        return
print(f(0))
None
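
A common way to run into this by accident is to compute a value and forget to return it. Here is a minimal sketch; the names add_one_broken and add_one are made up for illustration.

```python
def add_one_broken(x):
    x + 1  # the result is computed and then thrown away; no return


def add_one(x):
    return x + 1  # an explicit return hands the result back to the caller


broken_result = add_one_broken(0)
fixed_result = add_one(0)

print(broken_result)  # None
print(fixed_result)   # 1
```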

DRY principle, designing good functions¶

  • DRY: Don’t Repeat Yourself

  • See Wikipedia article

  • Consider the task of, for each element of a list, turning it into a palindrome

    • e.g. “mike” –> “mikeekim”

names = ["milad", "rodolfo", "tiffany","Firas"]
name = "mike"
name[::-1] 
'ekim'
names_backwards = list()

names_backwards.append(names[0] + names[0][::-1])
names_backwards.append(names[1] + names[1][::-1])
names_backwards.append(names[2] + names[2][::-1])
names_backwards.append(names[3] + names[3][::-1])

names_backwards 
['miladdalim', 'rodolfooflodor', 'tiffanyynaffit', 'FirassariF']
  • Above: this is gross, terrible, yucky code

    1. It only works for a list with 4 elements

    2. It only works for a list named names

    3. If we want to change its functionality, we need to change 4 similar lines of code (Don’t Repeat Yourself!!)

    4. It is hard to understand what it does just by looking at it

names_backwards = list()

for name in names:
    names_backwards.append(name + name[::-1])
    
names_backwards
['miladdalim', 'rodolfooflodor', 'tiffanyynaffit', 'FirassariF']

Above: this is slightly better. We have solved problems (1) and (3).

def make_palindromes(names):
    names_backwards = list()
    
    for name in names:
        names_backwards.append(name + name[::-1])
    
    return names_backwards

make_palindromes(names)
['miladdalim', 'rodolfooflodor', 'tiffanyynaffit', 'FirassariF']
  • Above: this is even better. We have now also solved problem (2), because you can call the function with any list, not just names.

  • For example, what if we had multiple lists:

names1 = ["milad", "rodolfo", "tiffany"]
names2 = ["Trudeau", "Scheer", "Singh", "Blanchet", "May"]
names3 = ["apple", "orange", "banana"]
names_backwards_1 = list()

for name in names1:
    names_backwards_1.append(name + name[::-1])
    
names_backwards_1 
['miladdalim', 'rodolfooflodor', 'tiffanyynaffit']
names_backwards_2 = list()

for name in names2:
    names_backwards_2.append(name + name[ ::-1])
    
names_backwards_2
['TrudeauuaedurT', 'ScheerreehcS', 'SinghhgniS', 'BlanchettehcnalB', 'MayyaM']
names_backwards_3 = list()

for name in names3:
    names_backwards_3.append(name + name[::-1])
    
names_backwards_3
['appleelppa', 'orangeegnaro', 'bananaananab']

Above: this is also very bad (and imagine if it were 20 lines of code instead of 2). This was problem (2). Our function makes it much better:

make_palindromes(names1)
['miladdalim', 'rodolfooflodor', 'tiffanyynaffit']
make_palindromes(names2)
['TrudeauuaedurT', 'ScheerreehcS', 'SinghhgniS', 'BlanchettehcnalB', 'MayyaM']
make_palindromes(names3)
['appleelppa', 'orangeegnaro', 'bananaananab']
  • You could get even more fancy, and put the lists of names into a list (so you have a list of lists).

  • Then you could loop over the list and call the function each time:

for list_of_names in [names1, names2, names3]:
    print(make_palindromes(list_of_names))
['miladdalim', 'rodolfooflodor', 'tiffanyynaffit']
['TrudeauuaedurT', 'ScheerreehcS', 'SinghhgniS', 'BlanchettehcnalB', 'MayyaM']
['appleelppa', 'orangeegnaro', 'bananaananab']

Designing good functions¶

  • How far you go with this is sort of a matter of personal style, and how you choose to apply the DRY principle: DON’T REPEAT YOURSELF!

  • These decisions are often ambiguous. For example:

    • Should make_palindromes be a function if I’m only ever doing it once? Twice?

    • Should the loop be inside the function, or outside?

    • Or should there be TWO functions, one that loops over the other??

  • In my personal opinion, make_palindromes does a bit too much to be understandable.

  • I prefer this:

def make_palindrome(name):
    return name + name[::-1]

make_palindrome("milad")
'miladdalim'
  • From here, we want to “apply make_palindrome to every element of a list”

  • It turns out this is an extremely common desire, so Python has built-in tools for it.

  • One of these is map, which we’ll cover later. But for now, just a comprehension will do:

[make_palindrome(name) for name in names1]
['miladdalim', 'rodolfooflodor', 'tiffanyynaffit']
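
For the curious, here is a small sketch of what the map version might look like. It assumes the same make_palindrome function and names1 list as above, redefined here so the cell runs on its own.

```python
def make_palindrome(name):
    return name + name[::-1]


names1 = ["milad", "rodolfo", "tiffany"]

# map applies the function to each element lazily; list() collects the results
palindromes_map = list(map(make_palindrome, names1))

# the equivalent list comprehension
palindromes_comp = [make_palindrome(name) for name in names1]

print(palindromes_map)
print(palindromes_map == palindromes_comp)  # True
```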

Other function design considerations:

  • Should we print output or produce plots inside or outside functions?

    • I would usually say outside, because this is a “side effect” of sorts

  • Should the function do one thing or many things?

    • This is a tough one, hard to answer in general
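
To make the printing question concrete, here is one possible way to compare the two choices. The names make_palindrome_noisy and make_palindrome_quiet are hypothetical, just for this comparison.

```python
def make_palindrome_noisy(name):
    # prints as a side effect and returns nothing, so the result cannot be reused
    print(name + name[::-1])


def make_palindrome_quiet(name):
    # only computes; the caller decides whether to print, store, or plot
    return name + name[::-1]


result = make_palindrome_quiet("milad")
print(result)  # the printing happens outside the function

noisy_result = make_palindrome_noisy("milad")  # prints, but returns None
```

With the quiet version, the result can be stored, tested with assert, or passed to another function; with the noisy version it is gone once it is printed.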

Optional & keyword arguments¶

  • Sometimes it is convenient to have default values for some arguments in a function.

  • Because they have default values, these arguments are optional, hence “optional arguments”

  • Example:

square()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_2026/1291548297.py in <module>
----> 1 square()

TypeError: square() missing 1 required positional argument: 'n'
def repeat_string(s, n=2):
    return s*n
repeat_string("mds",n=10)
'mdsmdsmdsmdsmdsmdsmdsmdsmdsmds'
repeat_string("mds-", 5)
'mds-mds-mds-mds-mds-'
repeat_string("mds") # do not specify `n`; it is optional
'mdsmds'

Sensible defaults:

  • Ideally, the default should be carefully chosen.

  • Here, the idea of “repeating” something makes me think of having 2 copies, so n=2 feels like a sensible default.

Syntax:

  • You can have any number of arguments and any number of optional arguments

  • All the optional arguments must come after the regular arguments

  • The regular arguments are mapped by the order they appear

  • The optional arguments can be specified out of order

def example(a, b, c="DEFAULT", d="DEFAULT"):
    print(a,b,c,d)
    
example(1,2,3,4)
1 2 3 4

Using the defaults for c and d:

example(1,2)
1 2 DEFAULT DEFAULT

Specifying c and d as keyword arguments (i.e. by name):

example(1,2,c=3,d=4)
1 2 3 4

Specifying only one of the optional arguments, by keyword:

example(1,2,c=3)
1 2 3 DEFAULT

Or the other:

example(1,2,d=4)
1 2 DEFAULT 4

Specifying all the arguments as keyword arguments, even though only c and d are optional:

example(a=1,b=2,c=3,d=4) 
1 2 3 4

Specifying c by the fact that it comes 3rd (I do not recommend this because I find it confusing):

example(1,2,3) # not recommended
1 2 3 DEFAULT

Specifying the optional arguments by keyword, but in the wrong order (this is also somewhat confusing, but not so terrible - I am OK with it):

example(1,2,d=4,c=3) 

Specifying the non-optional arguments by keyword (I am fine with this):

example(a=1,b=2)

Specifying the non-optional arguments by keyword, but in the wrong order (not recommended, I find it confusing):

example(b=2,a=1)

Specifying keyword arguments before non-keyword arguments (this raises a SyntaxError):

example(a=2,1)
  • In general, I am used to calling required arguments in order, and any optional arguments by keyword.

  • The language allows us to deviate from this, but it can be unnecessarily confusing sometimes.

Advanced stuff (optional):¶

  • You can also call/define functions with *args and **kwargs; see, e.g. here

  • Do not use mutable objects (such as lists) as default values in the function definition - see here under “Mutable Default Arguments”

def example(a, b=[]): # don't do this!
    return 0
def example(a, b=None): # instead, do this
    if b is None:
        b = []
    return 0
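
Why is this a problem? The default value is created once, when the function is defined, and is then shared across calls. A small sketch of the difference (append_bad and append_good are made-up names for illustration):

```python
def append_bad(item, bucket=[]):  # the same list object is reused on every call
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    if bucket is None:
        bucket = []  # a fresh list is created on each call
    bucket.append(item)
    return bucket


bad_first = list(append_bad(1))   # copy the result so we can compare later
bad_second = list(append_bad(2))  # the 1 from the first call is still there!

good_first = append_good(1)
good_second = append_good(2)

print(bad_first, bad_second)    # [1] [1, 2]
print(good_first, good_second)  # [1] [2]
```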

Docstrings¶

  • We got pretty far above, but we never solved problem (4): It is hard to understand what it does just by looking at it

  • Enter the idea of function documentation (and in particular docstrings)

  • The docstring goes right after the def line.

def make_palindrome(string):
    """Turns the string into a palindrome by concatenating itself with a reversed version of itself."""
    
    return string + string[::-1]

In IPython/Jupyter, we can use ? to view the documentation string of any function in our environment.

make_palindrome?
print?
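
The ? syntax only works in IPython/Jupyter. In a plain Python script or interpreter, the built-in help function and the __doc__ attribute give you the same information. A minimal sketch, with make_palindrome redefined so it runs on its own:

```python
def make_palindrome(string):
    """Turns the string into a palindrome by concatenating itself with a reversed version of itself."""
    return string + string[::-1]


docstring = make_palindrome.__doc__  # the raw docstring, stored on the function
print(docstring)

help(make_palindrome)  # prints the signature and the docstring
```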

Docstring structure¶

  1. Single-line: If it’s short, then just a single line describing the function will do (as above).

  2. PEP-8 style: Multi-line description + a list of arguments; see here. (Strictly speaking, the docstring conventions are described in PEP 257, which PEP 8 refers to.)

  3. Scipy style: The most elaborate & informative; see here and here.

The PEP-8 style:

def make_palindrome(s):
    """
    Turns the string into a palindrome by concatenating itself 
    with a reversed version of itself.
    
    Arguments:
    s - (str) the string to turn into a palindrome
    """
    return s + s[::-1]
make_palindrome?

The scipy style:

def make_palindrome(s):
    """
    Turn a string into a palindrome.
    
    Turns the string into a palindrome by concatenating itself 
    with a reversed version of itself, so that the returned
    string is twice as long as the original.
    
    Parameters
    ----------
    s : str
        The string to turn into a palindrome.
        
    Returns
    -------
    str
        The new palindrome string. 
        
    Examples
    --------
    >>> make_palindrome("abc")
    'abccba'
    """
    return s + s[::-1]
make_palindrome('hello') # press shift-tab HERE to get docstring!!
'helloolleh'
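
One nice thing about the Examples section is that the standard library's doctest module can run the >>> lines and check that the output matches. Here is a sketch using a shortened version of the docstring above. Note that doctest compares against what the interpreter would display, so a returned string is expected in single quotes.

```python
import doctest


def make_palindrome(s):
    """
    Turn a string into a palindrome.

    Examples
    --------
    >>> make_palindrome("abc")
    'abccba'
    """
    return s + s[::-1]


# collect the >>> examples from this one function's docstring and run them
finder = doctest.DocTestFinder(recurse=False)
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(make_palindrome, "make_palindrome",
                        globs={"make_palindrome": make_palindrome}):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results.attempted, results.failed)  # 1 0
```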

Below is the general form of the scipy docstring (reproduced from the scipy/numpy docs):

def function_name(param1,param2,param3):
    """First line is a short description of the function.
    
    A paragraph describing in a bit more detail what the
    function does and what algorithms it uses and common
    use cases.
    
    Parameters
    ----------
    param1 : datatype
        A description of param1.
    param2 : datatype
        A description of param2.
    param3 : datatype
        A longer description because maybe this requires
        more explanation and we can use several lines.
    
    Returns
    -------
    datatype
        A description of the output, datatypes and behaviours.
        Describe special cases and anything the user needs to
        know to use the function.
    
    Examples
    --------
    >>> function_name(3,8,-5)
    2.0
    """

Docstrings with optional arguments¶

When specifying the parameters, we specify the defaults for optional arguments:

# PEP-8 style
def repeat_string(s, n=2):
    """
    Repeat the string s, n times.
    
    Arguments:
    s -- (str) the string
    n -- (int) the number of times (default 2)
    """
    return s*n
# scipy style
def repeat_string(s, n=2):
    """
    Repeat the string s, n times.
    
    Parameters
    ----------
    s : str 
        the string
    n : int, optional (default = 2)
        the number of times
        
    Returns
    -------
    str
        the repeated string
        
    Examples
    --------
    >>> repeat_string("Blah", 3)
    'BlahBlahBlah'
    """
    return s*n

Automatically generated documentation¶

  • By following the docstring conventions, we can automatically generate documentation using libraries like sphinx, pydoc or Doxygen.

    • For example: compare this documentation with this code.

    • Notice the similarities? The webpage was automatically generated because the authors used standard conventions for docstrings!
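
To get a small taste of this without installing anything, the standard library's pydoc module can turn a docstring into a plain-text documentation page. This is only a sketch of the idea; tools like sphinx produce much nicer output but need more setup.

```python
import pydoc


def repeat_string(s, n=2):
    """
    Repeat the string s, n times.

    Parameters
    ----------
    s : str
        the string
    n : int, optional (default = 2)
        the number of times

    Returns
    -------
    str
        the repeated string
    """
    return s * n


# render_doc builds the same text that help() shows, straight from the docstring
doc_text = pydoc.render_doc(repeat_string, renderer=pydoc.plaintext)
print(doc_text)
```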

What makes good documentation?¶

  • What do you think about this?

################################
#
# NOT RECOMMENDED TO DO THIS!!!
#
################################

def make_palindrome(string):
    """
    Turns the string into a palindrome by concatenating itself 
    with a reversed version of itself. To do this, it uses the
    Python syntax of `[::-1]` to flip the string, and stores
    this in a variable called string_reversed. It then uses `+`
    to concatenate the two strings and return them to the caller.
    
    Arguments:
    string - (str) the string to turn into a palindrome
    
    Other variables:
    string_reversed - (str) the reversed string
    """
    
    string_reversed = string[::-1]
    return string + string_reversed



  • This is poor documentation! More is not necessarily better!

  • Why?

    • Very verbose

    • Write documentation about “what it does” and not “how you did it” (that is an implementation detail)

Side effects (careful!)¶

  • If a function changes the variables passed into it, then it is said to have side effects

  • Example:

def silly_sum(sri):
    sri.append(0)
    return sum(sri)
    
silly_sum([1,2,3,4])

Looks good, like it sums the numbers? But wait…

lst = [1,2,3,4]
silly_sum(lst)
silly_sum(lst)
lst
  • If your function has side effects like this, you must mention it in the documentation (later today).

  • In general avoid this!
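
One possible way to avoid the side effect is to build a new list inside the function instead of modifying the one passed in. A sketch (silly_sum_safe is a made-up name):

```python
def silly_sum_safe(numbers):
    # numbers + [0] creates a new list; the caller's list is left alone
    extended = numbers + [0]
    return sum(extended)


lst = [1, 2, 3, 4]
first = silly_sum_safe(lst)
second = silly_sum_safe(lst)

print(first, second)  # 10 10
print(lst)            # [1, 2, 3, 4]
```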

Unit tests, corner cases¶

assert statements¶

  • assert statements cause your program to fail if the condition is False.

  • They can be used as sanity checks for your program.

  • There are more sophisticated ways to “test” your programs, beyond the scope of this course

  • The syntax is:

assert expression , "Error message if expression is False."
assert 1 == 2 , "1 is not equal to 2."
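
Here is a small sketch of how that plays out in practice, using the square function from earlier (redefined so the cell runs on its own). The try/except is only there so we can look at the error message without stopping the program.

```python
def square(n):
    return n**2


# passes silently: the condition is True, so nothing happens
assert square(3) == 9, "square(3) should be 9"

# a failing assert raises AssertionError carrying the message
try:
    assert square(3) == 6, "square(3) should be 6"
except AssertionError as err:
    message = str(err)

print(message)  # square(3) should be 6
```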

Systematic Program Design¶

A systematic approach to program design is a general set of steps to follow when writing programs. Our approach includes:

  1. Write a stub: a function that does nothing but accept all input parameters and return the correct datatype.

  2. Write tests to satisfy the design specifications.

  3. Outline the program with pseudo-code.

  4. Write code and test frequently.

  5. Write documentation.

The key point: write tests BEFORE you write code.

  • You do not have to do this in MDS, but you may find it surprisingly helpful.

  • Often writing tests helps you think through what you are trying to accomplish.

  • It’s best to have that clear before you write the actual code.
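
As a sketch of steps 1, 2 and 4 from the list above, here is what a stub and its tests might look like before any real code exists. The function count_vowels is a made-up example, not something from the course.

```python
def count_vowels_stub(text):
    """Count the vowels in a string (stub: right signature and return type only)."""
    return 0


def check_count_vowels(func):
    """Tests written from the specification, before the real code exists."""
    assert func("") == 0, "empty string has no vowels"
    assert func("xyz") == 0, "no vowels present"
    assert func("banana") == 3, "three a's"
    assert func("AEIOU") == 5, "upper case counts too"
    return True


def count_vowels(text):
    """Count the vowels (a, e, i, o, u) in a string, ignoring case."""
    return sum(1 for char in text.lower() if char in "aeiou")


# the stub is expected to fail the tests; the real implementation should pass
try:
    check_count_vowels(count_vowels_stub)
    stub_passes = True
except AssertionError:
    stub_passes = False

print(stub_passes)                       # False
print(check_count_vowels(count_vowels))  # True
```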

Testing woes - false positives¶

  • Just because all your tests pass, this does not mean your program is correct!!

  • This happens all the time. How to deal with it?

    • Write a lot of tests!

    • Don’t be overconfident, even after writing a lot of tests!

def sample_median(x):
    """Finds the median of a list of numbers."""
    x_sorted = sorted(x)
    return x_sorted[len(x_sorted)//2]

assert sample_median([1,2,3,4,5]) == 3
assert sample_median([0,0,0,0]) == 0

Looks good? … ?






assert sample_median([1,2,3,4]) == 2.5






assert sample_median([1,3,2]) == 2
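
The test on [1,2,3,4] fails because the function ignores the even-length case and just returns the element at the middle index. One possible fix (a sketch, not necessarily how the original author would write it) averages the two middle values:

```python
def sample_median(x):
    """Finds the median of a non-empty list of numbers."""
    x_sorted = sorted(x)
    mid = len(x_sorted) // 2
    if len(x_sorted) % 2 == 1:
        return x_sorted[mid]  # odd length: the single middle element
    # even length: average the two middle elements
    return (x_sorted[mid - 1] + x_sorted[mid]) / 2


assert sample_median([1, 2, 3, 4, 5]) == 3
assert sample_median([0, 0, 0, 0]) == 0
assert sample_median([1, 2, 3, 4]) == 2.5
assert sample_median([1, 3, 2]) == 2
```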

Testing woes - false negatives¶

  • It can also happen, though more rarely, that your tests fail but your program is correct.

  • This means there is something wrong with your test.

  • For example, in the autograding for lab1 this happened to some people, because of tiny roundoff errors.
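
Here is a small illustration of the roundoff issue, and one common way to write a more forgiving test using math.isclose from the standard library. The numbers are just an example, not the ones from the lab.

```python
import math

total = 0.1 + 0.2

exact_match = (total == 0.3)            # False: 0.1 and 0.2 are not stored exactly
close_match = math.isclose(total, 0.3)  # True: equal within a small relative tolerance

print(total)
print(exact_match, close_match)
```

So a test like assert total == 0.3 would fail even though the code is "correct", while assert math.isclose(total, 0.3) would pass.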

Corner cases¶

  • A corner case is an input that is reasonable but a bit unusual, and may trip up your code.

  • For example, taking the median of an empty list, or a list with only one element.

  • Often it is desirable to add test cases to address corner cases.

assert sample_median([1]) == 1
  • In this case the code worked with no extra effort, but sometimes we need if statements to handle the weird cases.

  • Sometimes we want the code to throw an error (e.g. median of an empty list); more on this later.
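
As a preview, here is a sketch of what throwing an error for the empty list could look like. The name sample_median_strict is hypothetical, and it keeps the same middle-element logic as sample_median above, so the only new part is the check at the top.

```python
def sample_median_strict(x):
    """Finds the middle element of the sorted list; raises ValueError if the list is empty."""
    if len(x) == 0:
        raise ValueError("cannot take the median of an empty list")
    x_sorted = sorted(x)
    return x_sorted[len(x_sorted) // 2]


assert sample_median_strict([1]) == 1

# the corner case: check that the error is really raised
try:
    sample_median_strict([])
    raised = False
except ValueError:
    raised = True

print(raised)  # True
```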

Multiple return values¶

  • In most (all?) programming languages I’ve seen, functions can only return one thing.

  • That is technically true in Python, but there is a “workaround”, which is to return a tuple.

# not good from a design perspective!
def sum_and_product(x, y):
    return (x+y, x*y)
sum_and_product(5,6)

In some cases in Python, the parentheses can be omitted:

def sum_and_product(x, y):
    return x+y, x*y
sum_and_product(5,6)

It is common to store these in separate variables, so it really feels like the function is returning multiple values:

s, p = sum_and_product(5, 6)
s
p
  • Question: is this good function design?

  • Answer: usually not, but sometimes.

  • You will encounter this in some Python packages.
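
You do not even need a package to see this: the built-in divmod function returns the quotient and the remainder together as a tuple. A quick sketch:

```python
# divmod is a built-in that returns a tuple: (quotient, remainder)
result = divmod(17, 5)
print(result)  # (3, 2)

# unpacking makes it feel like two return values
quotient, remainder = divmod(17, 5)
print(quotient, remainder)  # 3 2
```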

That’s it for today!

See you on Friday