Schedule#

Class #

Date

Course Topics

1

Wednesday September 8

Course and Platform Introduction

2

Monday September 13

Data Formats

3

Wednesday September 15

Introduction to Excel

4

Monday September 20

Excel Data Analysis

5

Wednesday September 22

Excel Macros and Automation

6

Monday September 27

Online Services

7

Wednesday September 29

Debugging

8

Monday October 4

Quiz 1 - Timed, online

9

Wednesday October 6

Quiz 2 - Take-home

Course Learning Outcomes#

  1. Course and Platform Introduction

    • understand MDS structure and goals

    • list the different software used and their purpose

    • define command line and list some of its uses

    • know how to open the command line window on Mac OS and Windows

    • define: file system, folder, file

    • explain the difference between an absolute and relative path

    • define Git/GitHub and explain why they are useful for analysts

    • access the web interface of GitHub

    • using command line clone repository, add files, commit changes, and push to origin

    • view and edit text files directly on GitHub

  2. Data Formats

    • define: computer, software, memory, data, memory size/data size, cloud

    • explain β€œBig Data” and describe data growth

    • compare and contrast: digital versus analog

    • briefly explain how integers, doubles, and strings are encoded

    • explain why ASCII table is required for character encoding

    • explain why Unicode is used in certain situations instead of ASCII

    • explain the role of metadata for interpreting data

    • define: file, file encoding, text file, binary file

    • discuss the time-versus-space tradeoff

  3. Introduction to Excel

    • explain what a spreadsheet is

    • explain how cells are addressed in a spreadsheet

    • list some of the ways to select cells in a spreadsheet

    • define and explain: formula, function, argument, concatenation

    • use these functions: concatenate, lookup, index

    • explain the difference between an absolute and relative address

    • explain how an aggregate function works and list examples

    • Evaluate and create conditions. Use IF() to make decisions.

    • Explain how to use conditional formatting.

  4. Excel Data Analysis

    • explain how spreadsheets can be used as a database

    • use sorting and filtering

    • create and edit charts and use chart features: trendlines, sparklines

    • explain the usefulness of: what-if scenarios, goal seek, solver

    • use and create pivot tables and charts

    • use the Analysis ToolPak including computing a regression

  5. Excel Macros and Automation

    • list some reasons to use Excel VBA

    • define macro and explain the benefit of using macros

    • be able to record and execute a macro

    • explain the security issues with macros and how Excel deals with them

    • list and explain the use of the four main windows of the Visual Basic Editor

    • explain the role of the object browser

    • use the Range object to select a group of cells

    • create user-defined functions and use them in formulas

    • understand that Excel structures everything as objects with methods and properties

  6. Online Services

    • explain motivation for a data analyst to use cloud services for their work.

    • list some of the cloud service companies and the services they provide.

    • use Slack for group communication.

    • use Google Apps for collaborative document editing

    • list some trade-offs of using cloud services versus building using in-house tools such as R and Python

    • explain the role of Google Analytics and Google AdWords. Compare and contrast what these two services provide.

  7. Debugging, Libraries, and Documentation

    • define: debugging

    • list and explain the 4 key steps of debugging.

    • list and identify three types of program errors: syntax errors, logic errors, and exceptions.

    • be able to debug Python and R code.

    • understand different resources and techniques for getting help in R and Python.

    • read and apply standard documentation for R and Python.

    • understand the importance of using libraries to improve code development efficiency and reliability.

    • practice finding a library, reading its documentation, and applying its methods to solve a problem.