Practice with ggplot - Part 2#

Introduction#

Learning Outcomes#

In this lab you will:

  • Use the ggplot2 package in R to create the following types of plots:

    • Faceted plots

    • Ridgeline plots

    • Violin plots

    • Grouped bar plots

    • Box plots

    • Jitter/strip plots

  • Apply effective design principles to data visualizations

  • Evaluate and critique data visualizations based on effective design principles

  • Extract insight(s) from a visualization

  • Summarize the benefits and disadvantages of two plot types showing the same data

Practice Problems (PP) - More plot types with ggplot#

**Please note that all the practice problems are meant for practice only and will not be graded by the TAs.**

The purpose of these practice problems is to help you learn the ggplot2 syntax and prepare you to answer the lab questions.

Packages you need to install#

  • ggridges

  • Lahman

  • viridis

  • cowplot

    • Install the cowplot package using install.packages("cowplot") in your R console (do not include any install code in this notebook)

    • You may need to install some dependencies:

      • ImageMagick is a powerful tool for manipulating images and bitmaps, so I encourage you to spend a bit of time installing it. If you have trouble, please complete the rest of the lab first; you should only need it for Exercise 8 and the practice problems in this lab.

      • On macOS: you may need to install ImageMagick. The suggested method is using brew.

      • On Windows: you will likely not have ImageMagick installed. I’d recommend the binary file from here if you need to install it.

      • On Ubuntu: you likely already have ImageMagick, but in case you don’t, you can install the binary from here.

PP 1 - Violin plot#

Task: Make a violin plot of the total annual hits since 2010 for the Toronto Blue Jays. Your plot should look like the plot on the right.
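One possible starting point is sketched below. It is not the official solution, and it rests on a few assumptions: the data come from the Batting table in the Lahman package, "TOR" is the team ID for Toronto, and each observation is one player's hit total (H) for a season, with one violin per season.

```r
library(ggplot2)
library(Lahman)  # provides the Batting data frame

# Assumption: "TOR" is the Lahman teamID for the Toronto Blue Jays, and each
# row's H column holds one player's hits for one season (and stint).
jays <- subset(Batting, teamID == "TOR" & yearID >= 2010)

# yearID is converted to a factor so each season gets its own violin
# on a discrete x-axis.
pp1 <- ggplot(jays, aes(x = factor(yearID), y = H)) +
  geom_violin() +
  labs(x = "Season", y = "Hits")
pp1
```

If your plot differs from the reference image, check whether the reference groups the data differently (for example, a single violin across all seasons).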

PP 2 - Violin plot with jittered points#

Task: Add jittered points to the violin plot you made above. Your plot should look like the plot on the right.
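A minimal sketch of layering points over violins follows. It assumes the same setup as PP 1 (Lahman's Batting table, team ID "TOR", one violin per season); the jitter width and transparency values are arbitrary choices, not values taken from the reference plot.

```r
library(ggplot2)
library(Lahman)  # provides the Batting data frame

# Assumption: same filtering as in PP 1 ("TOR" = Toronto, seasons from 2010 on).
jays <- subset(Batting, teamID == "TOR" & yearID >= 2010)

pp2 <- ggplot(jays, aes(x = factor(yearID), y = H)) +
  geom_violin() +
  # width controls the horizontal spread; height = 0 keeps the hit counts
  # at their true values on the y-axis.
  geom_jitter(width = 0.15, height = 0, alpha = 0.5) +
  labs(x = "Season", y = "Hits")
pp2
```

Layers are drawn in the order they are added, so placing geom_jitter() after geom_violin() keeps the points on top.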

PP 3 - Faceted bar chart#

Task: Use the mpg dataset and make a bar plot of hwy vs cyl with a different panel for each year. The wrangling has been done for you. Your plot should look like the plot on the right.
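The provided wrangling code is not reproduced here, so the sketch below substitutes its own summary step: it assumes the bars show the mean hwy for each combination of cyl and year. The name mpg_summary is hypothetical. Treat this as an illustration of facet_wrap(), not as the expected answer.

```r
library(ggplot2)  # also provides the mpg dataset

# Assumption: a stand-in for the provided wrangling, computing the mean
# highway mileage for each cyl/year combination.
mpg_summary <- aggregate(hwy ~ cyl + year, data = mpg, FUN = mean)

pp3 <- ggplot(mpg_summary, aes(x = factor(cyl), y = hwy)) +
  geom_col() +
  facet_wrap(~ year) +  # one panel per model year
  labs(x = "Number of cylinders", y = "Mean highway mileage (mpg)")
pp3
```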

PP 4 - Stacked Bar chart#

Task: Use the ToothGrowth dataset and make a stacked bar plot of the tooth length observations (len) for each supplement type at each dose.
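A minimal sketch of one way to build such a plot is shown here. It assumes dose goes on the x-axis (as a factor) and supplement type is mapped to fill; the axis labels are illustrative.

```r
library(ggplot2)

# ToothGrowth ships with base R (datasets package).
# geom_col() stacks by default, so every len observation is piled on top of
# the others within each dose, coloured by supplement type.
pp4 <- ggplot(ToothGrowth, aes(x = factor(dose), y = len, fill = supp)) +
  geom_col() +
  labs(x = "Dose (mg/day)", y = "Tooth length (stacked)", fill = "Supplement")
pp4
```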

Notice how this is not an ideal way to visualize this data. Think about why that is…

Details about the ToothGrowth dataset:

Description

“The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).”

Source: https://stat.ethz.ch/R-manual/R-patched/library/datasets/html/ToothGrowth.html

PP 5 - Faceted box plot#

Let’s try a faceted box plot instead to visualize this dataset. We will step through it and improve it gradually, layer by layer.

Task: Use the ToothGrowth dataset and make a box plot of len for the different types of supplement (supp), with a panel for each dose.
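As a starting sketch, the plot below maps supp to both the x-axis and the fill colour. The fill mapping is an assumption, made so that the plot has a legend for the later steps to modify.

```r
library(ggplot2)

# Assumption: supp is mapped to fill as well as x, which produces a legend.
pp5 <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~ dose)  # one panel per dose level (0.5, 1, 2)
pp5
```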

PP 5.1 - Faceted Boxplot - change legend position to be below the x-axis#

Task: Modify the previous plot, moving the legend to the bottom of the plot.
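One way to do this is with theme(legend.position = "bottom"). The sketch rebuilds the faceted box plot so it runs on its own, and assumes supp is mapped to fill.

```r
library(ggplot2)

# Rebuild the faceted box plot (assumption: supp is mapped to fill).
base_plot <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~ dose)

# Move the legend from its default position (right) to below the panels.
pp5_1 <- base_plot + theme(legend.position = "bottom")
pp5_1
```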

PP 5.2 - Faceted Boxplot - improve legend labels#

Task: Modify the box plot again, improving the legend labels to show the full names of the supplements used.
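A possible approach is to relabel the fill scale. The sketch assumes supp is mapped to fill and takes the full names from the dataset description (OJ is orange juice, VC is ascorbic acid); the legend title "Supplement" is an illustrative choice.

```r
library(ggplot2)

pp5_2 <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~ dose) +
  # Labels are given in the order of the factor levels of supp: "OJ", "VC".
  scale_fill_discrete(
    name = "Supplement",
    labels = c("Orange juice", "Ascorbic acid")
  ) +
  theme(legend.position = "bottom")
pp5_2
```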

PP 5.3 - Faceted Boxplot - add x- and y-axis labels#

Task: Modify the x- and y-axis labels to make them more informative.
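Axis labels can be set with labs(). The wording of the labels below is only a suggestion, and the plot again assumes supp is mapped to fill.

```r
library(ggplot2)

pp5_3 <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~ dose) +
  # Replace the default axis titles (the column names supp and len).
  labs(x = "Supplement type", y = "Tooth length") +
  theme(legend.position = "bottom")
pp5_3
```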

PP 5.4 - Faceted Boxplot - rotate x-axis labels by 45 degrees#

Task: Rotate the x-axis labels by 45 degrees.
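Tick labels can be rotated through the axis.text.x theme element. In this sketch, hjust = 1 is an optional extra that right-aligns the rotated text under its tick mark; the rest of the plot follows the same assumptions as the earlier steps.

```r
library(ggplot2)

pp5_4 <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~ dose) +
  labs(x = "Supplement type", y = "Tooth length") +
  theme(
    legend.position = "bottom",
    # Rotate the x-axis tick labels; hjust = 1 aligns their ends with the ticks.
    axis.text.x = element_text(angle = 45, hjust = 1)
  )
pp5_4
```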

PP 6 - Ridgeline plot#

Task: Create a ridgeline plot of the mean temperature observations in Lincoln for each month. Add a color gradient based on the temperature values.
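The sketch below assumes the data come from the lincoln_weather dataset bundled with ggridges, which has a Month column and a column named `Mean Temperature [F]`. The scale and rel_min_height values and the colour option are illustrative choices. It uses ggplot2's built-in scale_fill_viridis_c(); the viridis package offers a similar scale.

```r
library(ggplot2)
library(ggridges)  # provides lincoln_weather and the ridgeline geoms

pp6 <- ggplot(
  lincoln_weather,
  # after_stat(x) fills each ridge according to the temperature value on x.
  aes(x = `Mean Temperature [F]`, y = Month, fill = after_stat(x))
) +
  geom_density_ridges_gradient(scale = 3, rel_min_height = 0.01) +
  scale_fill_viridis_c(name = "Temp. [F]", option = "C") +
  labs(x = "Mean temperature [F]", y = "Month")
pp6
```

The gradient variant of the geom is needed here because a plain geom_density_ridges() draws each ridge in a single fill colour.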

Congratulations!