Lab 3: Python Functions, Lists, and Dictionaries¶

In this lab you will practice Python basics.

This lab must be completed individually.

Where a Sample Output is provided, match it as closely as you can.

Accept the lab¶

To accept this lab on GitHub Classroom, you must click this link.

Objectives¶

Practice Python loops and conditions
Practice Python lists and dictionaries
Practice string manipulation in Python
Practice importing and using the Pandas module

Part 1: Python Fundamentals (10 marks)¶

This part of the lab takes you through some of the fundamental components of Python.

1A: `if-elif-else` statements (2 marks)¶

Fill in the missing pieces of the following code so that the print statements make sense. You should replace <YOUR_CODE_HERE> with your code.

name = 'Jon kflk'

### Your solution here

if len(name)>18:
    print('Name "{}" is more than 18 chars long'.format(name))
    length_description = 'long'
elif len(name) > 15:
    print('Name "{}" is more than 15 chars long'.format(name))
    length_description = 'semi long'
elif len(name) > 12:
    print('Name "{}" is more than 12 chars long'.format(name))
    length_description = 'semi long'
elif 9 <= len(name) <= 11:
    print('Name "{}" is 9, 10 or 11 chars long'.format(name))
    length_description = 'semi short'
else:
    print('Name "{}" is a short name'.format(name))
    length_description = 'short'

Name "Jon kflk" is a short name
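One way to exercise every branch of the chain above is to wrap it in a function and call it with names of different lengths. The sketch below is only an illustration: the function name `describe_length` and the sample names are hypothetical and not part of the lab.

```python
def describe_length(name):
    """Return the length description that the if-elif-else chain assigns."""
    n = len(name)
    if n > 18:
        return 'long'
    elif n > 12:
        # covers both the "> 15" and "> 12" branches, which share a label
        return 'semi long'
    elif 9 <= n <= 11:
        return 'semi short'
    else:
        return 'short'


# Hypothetical sample names, chosen to hit each branch once.
for sample in ['Jon kflk', 'Jonathan K', 'Jonathan Kflk', 'Jonathan Kflk-Example']:
    print(sample, len(sample), describe_length(sample))
```

With these thresholds, a name of exactly 12 characters matches none of the first four conditions and is reported as short, which may or may not be intended.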

1B: `for` loops (2 marks)¶

Fill in <YOUR_CODE_HERE> in the code snippet below so that this sample output is printed:

Sample Output¶

A
AA
AAA
AAAA
AAAAA
AAAA
AAA
AA
A

### Your solution here

n = 10
for i in range(1,n):
    if i < n/2:
        print("A" * i)
    else:
        print("A" * (n-i))

A
AA
AAA
AAAA
AAAAA
AAAA
AAA
AA
A
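The same pattern can also be produced without an if/else, since the line length is the smaller of `i` and `n - i`. This is a minimal alternative sketch, not the expected solution; the helper name `triangle_lines` is hypothetical.

```python
def triangle_lines(n):
    """Return the lines of the pattern for a given n (n = 10 gives 9 lines)."""
    # min(i, n - i) rises from 1 up to n // 2 and then falls back to 1
    return ["A" * min(i, n - i) for i in range(1, n)]


for line in triangle_lines(10):
    print(line)
```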

1C: `for` loops and `lists`¶

We have given you some sample code, as well as a list called data.

data = [53,9,5,90,63,5,97,40,92,48,53,8,38,63,13,15,66,81,57,79,42,91,25,89,66,4,73,45,80,17]

For each of the exercises in section 1C, use the data list to answer the question by writing the appropriate Python code.

1C1. Print values that are within the lower and upper bounds of 15 and 40, inclusive (2 marks)¶

Hint: Use a for loop to iterate over each element in data.

Sample output (it’s okay if yours is printed vertically rather than horizontally):¶

[40, 38, 15, 25, 17]

data = [53,9,5,90,63,5,97,40,92,48,53,8,38,63,13,15,66,81,57,79,42,91,25,89,66,4,73,45,80,17]

### Your solution here

for d in data:
    if 15 <= d <= 40:
        print(d)
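The loop above prints one value per line. To reproduce the horizontal form of the sample output, the matching values can be collected into a list first. This is a sketch of one way to do it; the name `in_bounds` is hypothetical.

```python
data = [53,9,5,90,63,5,97,40,92,48,53,8,38,63,13,15,66,81,57,79,42,91,25,89,66,4,73,45,80,17]

in_bounds = []                 # values that fall inside the bounds, in original order
for d in data:
    if 15 <= d <= 40:          # both bounds are inclusive
        in_bounds.append(d)

print(in_bounds)
```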

1C2. Write code to calculate and print the maximum, minimum, sum, count, and average of items in `data` (4 marks)¶

Hint: for the mean, you will need to combine the sum and the count

Sample output¶

The max is: 97
The min is: 4
The sum is: 1507
The count is: 30
The mean is: 50.233333333333334

### Your solution here

print('The max is: {0}'.format(max(data)))
print('The min is: {0}'.format(min(data)))
print('The sum is: {0}'.format(sum(data)))
print('The count is: {0}'.format(len(data)))
print('The mean is: {0}'.format(sum(data)/len(data)))

The max is: 97
The min is: 4
The sum is: 1507
The count is: 30
The mean is: 50.233333333333334
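The solution above relies on the built-ins `max`, `min`, `sum`, and `len`. For comparison, here is a sketch that computes the same five quantities in a single explicit loop, following the hint that the mean combines the sum and the count. The variable names are hypothetical.

```python
data = [53,9,5,90,63,5,97,40,92,48,53,8,38,63,13,15,66,81,57,79,42,91,25,89,66,4,73,45,80,17]

largest = smallest = data[0]   # start both trackers from the first element
total = 0
count = 0
for d in data:
    if d > largest:
        largest = d
    if d < smallest:
        smallest = d
    total += d
    count += 1

mean = total / count           # true division, so the result is a float

print('The max is: {0}'.format(largest))
print('The min is: {0}'.format(smallest))
print('The sum is: {0}'.format(total))
print('The count is: {0}'.format(count))
print('The mean is: {0}'.format(mean))
```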

1C3. (Optional) Using a list comprehension (NOT for loops), print out the numbers within the specified lower and upper bounds (inclusive) of 12 and 80 (0 marks).¶

Sample output:¶

[53, 63, 40, 48, 53, 38, 63, 13, 15, 66, 57, 79, 42, 25, 66, 73, 45, 80, 17]

### Your solution here

[d for d in data if 12 <= d <= 80]

[53, 63, 40, 48, 53, 38, 63, 13, 15, 66, 57, 79, 42, 25, 66, 73, 45, 80, 17]

Part 2: Working with data using pandas (12 marks)¶

In this part of the lab, we will practice loading in a sample data set using pandas, and doing some basic operations.

2A1. Load in the data (4 marks)¶

Here is the URL of the pokemon dataset inside the data folder.

Your task is to use the pandas read_csv() function to read this dataset, assign it to a dataframe called df, and then print its head (the first 5 rows of the dataframe).

Hint: don’t forget to first import pandas as pd so that you can use read_csv and other pandas functions.

### Your solution here
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/firasm/bits/master/pokemon.csv')
df.head()
# or 

# df = pd.read_csv('data/pokemon.csv')
# df.head()

	#	Name	Type 1	Type 2	Total	HP	Attack	Defense	Sp. Atk	Sp. Def	Speed	Generation	Legendary
0	1	Bulbasaur	Grass	Poison	318	45	49	49	65	65	45	1	False
1	2	Ivysaur	Grass	Poison	405	60	62	63	80	80	60	1	False
2	3	Venusaur	Grass	Poison	525	80	82	83	100	100	80	1	False
3	3	VenusaurMega Venusaur	Grass	Poison	625	80	100	123	122	120	80	1	False
4	4	Charmander	Fire	NaN	309	39	52	43	60	50	65	1	False

2A2. How many Pokemon are there in total in the dataset? (2 marks)¶

Make sure to use the dataframe.count() function to print the total number of entries in each column of the dataframe before you answer!

### Your solution here

df.count()

#             800
Name          800
Type 1        800
Type 2        414
Total         800
HP            800
Attack        800
Defense       800
Sp. Atk       800
Sp. Def       800
Speed         800
Generation    800
Legendary     800
dtype: int64
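`count()` reports the number of non-missing values in each column, which is why `Type 2` shows 414 while every other column shows 800. The sketch below illustrates the difference between `count()` and the row count on a tiny dataframe built from the five rows displayed in 2A1 (only three columns are kept; the name `small` is hypothetical).

```python
import pandas as pd

# The first five rows of the dataset as displayed in 2A1 (subset of columns).
small = pd.DataFrame({
    'Name': ['Bulbasaur', 'Ivysaur', 'Venusaur', 'VenusaurMega Venusaur', 'Charmander'],
    'Type 1': ['Grass', 'Grass', 'Grass', 'Grass', 'Fire'],
    'Type 2': ['Poison', 'Poison', 'Poison', 'Poison', None],  # Charmander has no second type
})

print(small.count())   # non-missing values per column
print(len(small))      # number of rows, regardless of missing values
print(small.shape)     # (rows, columns)
```

For a question about the total number of rows, `len(df)` or `df.shape[0]` does not depend on which column happens to have missing values.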

2A3. Create a new dataframe `df2` that only includes the Pokemon from the first generation. (2 marks)¶

Hint: Remember that you can subset dataframes using the [] syntax. More on this here

### Your solution here

df2 = df[df['Generation']==1]

df2

	#	Name	Type 1	Type 2	Total	HP	Attack	Defense	Sp. Atk	Sp. Def	Speed	Generation	Legendary
0	1	Bulbasaur	Grass	Poison	318	45	49	49	65	65	45	1	False
1	2	Ivysaur	Grass	Poison	405	60	62	63	80	80	60	1	False
2	3	Venusaur	Grass	Poison	525	80	82	83	100	100	80	1	False
3	3	VenusaurMega Venusaur	Grass	Poison	625	80	100	123	122	120	80	1	False
4	4	Charmander	Fire	NaN	309	39	52	43	60	50	65	1	False
...	...	...	...	...	...	...	...	...	...	...	...	...	...
161	149	Dragonite	Dragon	Flying	600	91	134	95	100	100	80	1	False
162	150	Mewtwo	Psychic	NaN	680	106	110	90	154	90	130	1	True
163	150	MewtwoMega Mewtwo X	Psychic	Fighting	780	106	190	100	154	100	130	1	True
164	150	MewtwoMega Mewtwo Y	Psychic	NaN	780	106	150	70	194	120	140	1	True
165	151	Mew	Psychic	NaN	600	100	100	100	100	100	100	1	False

166 rows × 13 columns

2A4. Print ONLY the mean HP, Attack, Defense, and Speed of all pokemon in the first generation using pandas functions (4 marks)¶

### Your solution here

df[df['Generation']==1][['HP','Attack','Defense','Speed']].mean()

# OR 

df2[['HP','Attack','Defense','Speed']].mean()

HP         65.819277
Attack     76.638554
Defense    70.861446
Speed      72.584337
dtype: float64
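The filter and the column selection can also be combined in a single `.loc` call. The sketch below shows the idea on a toy dataframe: the three first-generation rows are copied from the output shown earlier, while the two `Placeholder` rows are invented purely so that the filter has something to exclude.

```python
import pandas as pd

toy = pd.DataFrame({
    'Name': ['Bulbasaur', 'Charmander', 'Mew', 'Placeholder A', 'Placeholder B'],
    'HP': [45, 39, 100, 50, 70],
    'Attack': [49, 52, 100, 60, 80],
    'Defense': [49, 43, 100, 55, 75],
    'Speed': [45, 65, 100, 40, 90],
    'Generation': [1, 1, 1, 2, 2],   # the generation-2 rows are invented placeholders
})

is_gen1 = toy['Generation'] == 1             # boolean mask, one entry per row
stats = ['HP', 'Attack', 'Defense', 'Speed']

# .loc[rows, columns] applies the row mask and the column list together
gen1_means = toy.loc[is_gen1, stats].mean()
print(gen1_means)
```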

Part 3 - Dictionaries, Lists and data manipulation (10 marks)¶

In this part we explore another fundamental data structure in Python: the dictionary.

3A. Create a dictionary that has 3 keys: name, age and salary, and enter in the following dummy information. (2 marks)¶

Sample output¶

{'name': 'Tim Cook', 'age': 59, 'salary': 30000000.0}

### Your solution here

test_dict = {'name':'Tim Cook',
             'age': 59,
             'salary':3E7}

test_dict

{'name': 'Tim Cook', 'age': 59, 'salary': 30000000.0}

3B. Create a second dictionary, this time with at least 5 different names, ages, and salaries. (2 marks)¶

Hint: There should only be three keys, and each value should be a list.

Sample Output¶

{'name': ['Tim Cook', 'Person 2', 'Person 3'], 'age': [59, 24, 40], 'salary': [30000000.0, 200000.0, 900000.0]}

### Your solution here

test_dict = {'name':['Tim Cook','Person 2','Person 3'],
             'age': [59,24,40],
             'salary':[3E7,2E5,9E5]}

3C. Create a pandas dataframe using the dictionary you created above (2 marks)¶

Hint: Use the pd.DataFrame.from_dict() method

Sample output¶

[Image: sample dataframe output (df.png)]

### Your solution here

df_salary = pd.DataFrame.from_dict(test_dict)

df_salary

	name	age	salary
0	Tim Cook	59	30000000.0
1	Person 2	24	200000.0
2	Person 3	40	900000.0

3D. Rename all three columns of the dataframe (4 marks)¶

Hint: you should use the pandas function rename() to accomplish this. You can rename the columns to whatever you like; just show us that you are able to use this function.

### Your solution here
df_salary.rename(columns={"name":"Full Name",
                          "age": "Age",
                          "salary": "Annual Salary"})

	Full Name	Age	Annual Salary
0	Tim Cook	59	30000000.0
1	Person 2	24	200000.0
2	Person 3	40	900000.0
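Note that `rename()` returns a new dataframe and leaves the original untouched unless the result is assigned. The sketch below rebuilds the dataframe from 3C and keeps the renamed copy in a separate variable; the name `renamed` is hypothetical.

```python
import pandas as pd

df_salary = pd.DataFrame({'name': ['Tim Cook', 'Person 2', 'Person 3'],
                          'age': [59, 24, 40],
                          'salary': [3E7, 2E5, 9E5]})

# rename() does not modify df_salary; it returns a renamed copy
renamed = df_salary.rename(columns={"name": "Full Name",
                                    "age": "Age",
                                    "salary": "Annual Salary"})

print(list(df_salary.columns))   # original column names are unchanged
print(list(renamed.columns))     # the copy carries the new names
```

Assigning the result back, as in `df_salary = df_salary.rename(columns=...)`, is one way to make the change stick.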

Part 4 - More dictionary practice (4 marks - Bonus)¶

(Bonus) (3 marks) Create a Python program that takes a string text and calculates the frequency of each character (the sample output counts spaces and punctuation as well as letters). Data set (copy it as a string into your Python code):

text = """Elephants are mammals of the family Elephantidae and the largest existing land animals. Three species are currently recognised: the African bush elephant, the African forest elephant, and the Asian elephant. Elephantidae is the only surviving family of the order Proboscidea; extinct members include the mastodons. The family Elephantidae also contains several now-extinct groups, including the mammoths and straight-tusked elephants. African elephants have larger ears and concave backs, whereas Asian elephants have smaller ears, and convex or level backs. Distinctive features of all elephants include a long trunk, tusks, large ear flaps, massive legs, and tough but sensitive skin. The trunk, also called a proboscis, is used for breathing, bringing food and water to the mouth, and grasping objects. Tusks, which are derived from the incisor teeth, serve both as weapons and as tools for moving objects and digging. The large ear flaps assist in maintaining a constant body temperature as well as in communication. The pillar-like legs carry their great weight. Elephants are scattered throughout sub-Saharan Africa, South Asia, and Southeast Asia and are found in different habitats, including savannahs, forests, deserts, and marshes. They are herbivorous, and they stay near water when it is accessible. They are considered to be keystone species, due to their impact on their environments.[1] Other animals tend to keep their distance from elephants; the exception is their predators such as lions, tigers, hyenas, and wild dogs, which usually target only young elephants (calves). Elephants have a fission–fusion society, in which multiple family groups come together to socialise. Females (cows) tend to live in family groups, which can consist of one female with her calves or several related females with offspring. The groups, which do not include bulls, are led by the (usually) oldest cow, known as the matriarch."""

Hint: There are many ways to do this. The most “elegant” way uses a mixture of sets and the Counter class from the collections module.

Sample output:¶

' ': Count of 293 and Percentage of 15.2%
'(': Count of 3 and Percentage of 0.2%
')': Count of 3 and Percentage of 0.2%
',': Count of 32 and Percentage of 1.7%
'-': Count of 4 and Percentage of 0.2%
'.': Count of 17 and Percentage of 0.9%
'1': Count of 1 and Percentage of 0.1%
':': Count of 1 and Percentage of 0.1%
';': Count of 2 and Percentage of 0.1%
'A': Count of 8 and Percentage of 0.4%
'D': Count of 1 and Percentage of 0.1%
'E': Count of 6 and Percentage of 0.3%
'F': Count of 1 and Percentage of 0.1%
'O': Count of 1 and Percentage of 0.1%
'P': Count of 1 and Percentage of 0.1%
'S': Count of 3 and Percentage of 0.2%
'T': Count of 9 and Percentage of 0.5%
'[': Count of 1 and Percentage of 0.1%
']': Count of 1 and Percentage of 0.1%
'a': Count of 147 and Percentage of 7.6%
'b': Count of 20 and Percentage of 1.0%
'c': Count of 55 and Percentage of 2.8%
'd': Count of 53 and Percentage of 2.7%
'e': Count of 189 and Percentage of 9.8%
'f': Count of 32 and Percentage of 1.7%
'g': Count of 37 and Percentage of 1.9%
'h': Count of 84 and Percentage of 4.4%
'i': Count of 109 and Percentage of 5.6%
'j': Count of 2 and Percentage of 0.1%
'k': Count of 12 and Percentage of 0.6%
'l': Count of 80 and Percentage of 4.1%
'm': Count of 35 and Percentage of 1.8%
'n': Count of 114 and Percentage of 5.9%
'o': Count of 87 and Percentage of 4.5%
'p': Count of 34 and Percentage of 1.8%
'r': Count of 92 and Percentage of 4.8%
's': Count of 131 and Percentage of 6.8%
't': Count of 120 and Percentage of 6.2%
'u': Count of 41 and Percentage of 2.1%
'v': Count of 22 and Percentage of 1.1%
'w': Count of 19 and Percentage of 1.0%
'x': Count of 5 and Percentage of 0.3%
'y': Count of 21 and Percentage of 1.1%
'–': Count of 1 and Percentage of 0.1%

text = """Elephants are mammals of the family Elephantidae and the largest existing land animals. Three species are currently recognised: the African bush elephant, the African forest elephant, and the Asian elephant. Elephantidae is the only surviving family of the order Proboscidea; extinct members include the mastodons. The family Elephantidae also contains several now-extinct groups, including the mammoths and straight-tusked elephants. African elephants have larger ears and concave backs, whereas Asian elephants have smaller ears, and convex or level backs. Distinctive features of all elephants include a long trunk, tusks, large ear flaps, massive legs, and tough but sensitive skin. The trunk, also called a proboscis, is used for breathing, bringing food and water to the mouth, and grasping objects. Tusks, which are derived from the incisor teeth, serve both as weapons and as tools for moving objects and digging. The large ear flaps assist in maintaining a constant body temperature as well as in communication. The pillar-like legs carry their great weight. Elephants are scattered throughout sub-Saharan Africa, South Asia, and Southeast Asia and are found in different habitats, including savannahs, forests, deserts, and marshes. They are herbivorous, and they stay near water when it is accessible. They are considered to be keystone species, due to their impact on their environments.[1] Other animals tend to keep their distance from elephants; the exception is their predators such as lions, tigers, hyenas, and wild dogs, which usually target only young elephants (calves). Elephants have a fission–fusion society, in which multiple family groups come together to socialise. Females (cows) tend to live in family groups, which can consist of one female with her calves or several related females with offspring. The groups, which do not include bulls, are led by the (usually) oldest cow, known as the matriarch."""

### Your solution here

from collections import Counter
counts=Counter(text)

for i in sorted(set(text)):
    print("'{0}': Count of {1} and Percentage of {2:.1f}%".format(i,counts[i],100*counts[i]/len(text)))

' ': Count of 293 and Percentage of 15.2%
'(': Count of 3 and Percentage of 0.2%
')': Count of 3 and Percentage of 0.2%
',': Count of 32 and Percentage of 1.7%
'-': Count of 4 and Percentage of 0.2%
'.': Count of 17 and Percentage of 0.9%
'1': Count of 1 and Percentage of 0.1%
':': Count of 1 and Percentage of 0.1%
';': Count of 2 and Percentage of 0.1%
'A': Count of 8 and Percentage of 0.4%
'D': Count of 1 and Percentage of 0.1%
'E': Count of 6 and Percentage of 0.3%
'F': Count of 1 and Percentage of 0.1%
'O': Count of 1 and Percentage of 0.1%
'P': Count of 1 and Percentage of 0.1%
'S': Count of 3 and Percentage of 0.2%
'T': Count of 9 and Percentage of 0.5%
'[': Count of 1 and Percentage of 0.1%
']': Count of 1 and Percentage of 0.1%
'a': Count of 147 and Percentage of 7.6%
'b': Count of 20 and Percentage of 1.0%
'c': Count of 55 and Percentage of 2.8%
'd': Count of 53 and Percentage of 2.7%
'e': Count of 189 and Percentage of 9.8%
'f': Count of 32 and Percentage of 1.7%
'g': Count of 37 and Percentage of 1.9%
'h': Count of 84 and Percentage of 4.4%
'i': Count of 109 and Percentage of 5.6%
'j': Count of 2 and Percentage of 0.1%
'k': Count of 12 and Percentage of 0.6%
'l': Count of 80 and Percentage of 4.1%
'm': Count of 35 and Percentage of 1.8%
'n': Count of 114 and Percentage of 5.9%
'o': Count of 87 and Percentage of 4.5%
'p': Count of 34 and Percentage of 1.8%
'r': Count of 92 and Percentage of 4.8%
's': Count of 131 and Percentage of 6.8%
't': Count of 120 and Percentage of 6.2%
'u': Count of 41 and Percentage of 2.1%
'v': Count of 22 and Percentage of 1.1%
'w': Count of 19 and Percentage of 1.0%
'x': Count of 5 and Percentage of 0.3%
'y': Count of 21 and Percentage of 1.1%
'–': Count of 1 and Percentage of 0.1%
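Since this part is about dictionary practice, the same tally can also be built with a plain dictionary instead of `Counter`. The sketch below is one possible alternative, not the reference solution; the function name `char_frequencies` and the short stand-in string are hypothetical.

```python
def char_frequencies(text):
    """Map each character in text to a (count, percentage) pair."""
    counts = {}
    for ch in text:
        # dict.get returns 0 the first time a character is seen
        counts[ch] = counts.get(ch, 0) + 1
    total = len(text)
    return {ch: (n, 100 * n / total) for ch, n in counts.items()}


sample = "elephant elephants"  # short stand-in for the full passage
for ch, (n, pct) in sorted(char_frequencies(sample).items()):
    print("'{0}': Count of {1} and Percentage of {2:.1f}%".format(ch, n, pct))
```

Sorting the dictionary items orders the characters by their code points, which matches the ordering produced by `sorted(set(text))` in the solution above.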

DATA 301
