Course Schedule

Class #

Date

Course Topics

Readings

1

Wednesday September 9

Course and Platform Introduction

Lecture 1

2

Monday September 14

Data Formats

Lecture 2

3

Wednesday September 16

Introduction to Excel

Lecture 3

4

Monday September 21

Excel Data Analysis

Lecture 4

5

Wednesday September 23

Jupyter Lab, VS Code, GitHub Pages

Lecture 5

6

Monday September 28

Online Services

Lecture 6

7

Wednesday September 30

Quiz 1

Lecture 7

8

Monday October 5

Debugging

Lecture 8

9

Wednesday October 7

Quiz 2

Lecture 9

Lecture Learning Outcomes

  1. Course and Platform Introduction

    • understand MDS structure and goals

    • list the different software used and their purpose

    • define command line and list some of its uses

    • know how to open the command line window on Mac OS and Windows

    • define: file system, folder, file

    • explain the difference between an absolute and relative path

    • define Git/GitHub and explain why they are useful for analysts

    • access the web interface of GitHub

    • using command line clone repository, add files, commit changes, and push to origin

    • view and edit text files directly on GitHub

  2. Data Formats

    • define: computer, software, memory, data, memory size/data size, cloud

    • explain “Big Data” and describe data growth

    • compare and contrast: digital versus analog

    • briefly explain how integers, doubles, and strings are encoded

    • explain why ASCII table is required for character encoding

    • explain why Unicode is used in certain situations instead of ASCII

    • explain the role of metadata for interpreting data

    • define: file, file encoding, text file, binary file

    • discuss the time-versus-space tradeoff

  3. Introduction to Excel

    • explain what a spreadsheet is

    • explain how cells are addressed in a spreadsheet

    • list some of the ways to select cells in a spreadsheet

    • define and explain: formula, function, argument, concatenation

    • use these functions: concatenate, lookup, index

    • explain the difference between an absolute and relative address

    • explain how an aggregate function works and list examples

    • Evaluate and create conditions. Use IF() to make decisions.

    • Explain how to use conditional formatting.

  4. Excel Data Analysis

    • explain how spreadsheets can be used as a database

    • use sorting and filtering

    • create and edit charts and use chart features: trendlines, sparklines

    • explain the usefulness of: what-if scenarios, goal seek, solver

    • use and create pivot tables and charts

    • use the Analysis ToolPak including computing a regression

  5. Interactive Session: Jupyter Lab, VS Code, and GitHub pages

    • explain how/what jupyter lab and jupyter notebooks work

    • explore the Jupyter Lab interface and ecosystem

    • discuss what nbconvert is and does

    • discuss workflows with VS Code

    • deploy your own personal website (if time permits)

  6. Online Services

    • explain motivation for a data analyst to use cloud services for their work.

    • list some of the cloud service companies and the services they provide.

    • use Slack for group communication.

    • use Google Apps for collaborative document editing

    • list some trade-offs of using cloud services versus building using in-house tools such as R and Python

    • explain the role of Google Analytics and Google AdWords. Compare and contrast what these two services provide.

  7. Debugging, Libraries, Documentation, and Help

    • define: debugging

    • list and explain the 4 key steps of debugging.

    • list and identify three types of program errors: syntax errors, logic errors, and exceptions.

    • be able to debug Python and R code.

    • understand different resources and techniques for getting help in R and Python.

    • read and apply standard documentation for R and Python.

    • understand the importance of using libraries to improve code development efficiency and reliability.

    • practice finding a library, reading its documentation, and applying its methods to solve a problem.