Practice with ggplot - Part 2#
Introduction#
Learning Outcomes#
In this lab you will:
Use the ggplot library in R to generate data visualizations for the following plots:
Faceted plots
Ridgeline plots
Violin plots
Grouped bar plots
Box plots
Jitter/strip plots
Apply effective design principles to data visualizations
Evaluate and critique data visualizations based on effective design principles
Extract insight(s) from a visualization
Summarize the benefits and disadvantages of two plot types showing the same data
Practice Problems (PP) - More plot types with ggplot#
**Please note that ALL the practice problems are for practice only and will not be graded by the TAs.**
The purpose of these practice problems is to help you learn the ggplot2 syntax and prepare you to answer the lab questions.
Packages you need to install#
ggridges
Lahman
viridis
cowplot
Install the cowplot package using install.packages("cowplot") in your R console (do not include any install code in this notebook).
You may need to install some dependencies:
Imagemagick is a pretty powerful tool for manipulating images and bitmaps, so I encourage you to spend a bit of time installing it. If you do have trouble, please try to complete the rest of the lab first; you should only need it for Exercise 8 and the practice problems in this lab.
On macOS: you may need to install imagemagick. The suggested method is using brew.
On Windows: you will likely not have imagemagick installed. I'd recommend the binary file from here if you need to install it.
On Ubuntu: you likely already have imagemagick, but in case you don't, you can install the binary from here.
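Before starting, it can help to confirm that the packages listed above are available. The following is a minimal sketch that only checks and reports; it deliberately contains no install code. The variable names `required_pkgs` and `missing_pkgs` are illustrative.

```r
# Packages listed in this lab, plus ggplot2 itself
required_pkgs <- c("ggplot2", "ggridges", "Lahman", "viridis", "cowplot")

# requireNamespace() returns TRUE if a package can be loaded, without attaching it
is_available <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_available]

if (length(missing_pkgs) > 0) {
  message("Install these from your R console first: ",
          paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages for this lab are available.")
}
```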
PP 1 - Violin plot#
Task: Make a violin plot of the total annual hits since 2010 for the Toronto Blue Jays. Your plot should look like the plot on the right.
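One possible approach to this task is sketched below. It assumes the data come from the `Batting` table in the Lahman package, that the team is coded as `"TOR"`, and that "total annual hits" means each player's hits (`H`) summed per season, with one violin per season. The reference plot may have been built differently, so treat this as a starting point.

```r
library(ggplot2)
library(dplyr)
library(Lahman)

# Assumption: one row per player per season, summing hits across stints
jays_hits <- Batting %>%
  filter(teamID == "TOR", yearID >= 2010) %>%
  group_by(playerID, yearID) %>%
  summarise(hits = sum(H), .groups = "drop")

# One violin per season showing the distribution of player hit totals
violin_plot <- ggplot(jays_hits, aes(x = factor(yearID), y = hits)) +
  geom_violin() +
  labs(x = "Season", y = "Hits per player")

violin_plot
```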
PP 2 - Violin plot with jittered points#
Task: Add jittered points to the violin plot you made above. Your plot should look like the plot on the right.
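Jittered points can be layered on a violin plot with `geom_jitter()`. The sketch below rebuilds the data under the same assumptions as before (Lahman `Batting` table, team code `"TOR"`, per-player season totals); the jitter width, transparency, and point size are arbitrary choices.

```r
library(ggplot2)
library(dplyr)
library(Lahman)

# Assumption: per-player hit totals for each season since 2010
jays_hits <- Batting %>%
  filter(teamID == "TOR", yearID >= 2010) %>%
  group_by(playerID, yearID) %>%
  summarise(hits = sum(H), .groups = "drop")

violin_jitter_plot <- ggplot(jays_hits, aes(x = factor(yearID), y = hits)) +
  geom_violin() +
  # height = 0 keeps the true hit counts; only the x position is jittered
  geom_jitter(width = 0.15, height = 0, alpha = 0.5, size = 1) +
  labs(x = "Season", y = "Hits per player")

violin_jitter_plot
```

Setting `height = 0` matters here: vertical jitter would move points away from their actual values.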
PP 3 - Faceted bar chart#
Task: Use the mpg dataset and make a bar plot of hwy vs cyl with a different panel for each year. The wrangling has been done for you; your plot should look like the plot on the right.
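A faceted bar chart along these lines could look like the sketch below. The original wrangling is not shown here, so this example assumes it computes the mean `hwy` for each combination of `year` and `cyl`; the name `mpg_summary` is illustrative.

```r
library(ggplot2)
library(dplyr)

# Assumption: the wrangling step averages highway mileage per year and cylinder count
mpg_summary <- mpg %>%
  group_by(year, cyl) %>%
  summarise(mean_hwy = mean(hwy), .groups = "drop")

faceted_bar_plot <- ggplot(mpg_summary, aes(x = factor(cyl), y = mean_hwy)) +
  geom_col() +          # geom_col() uses the y values as bar heights
  facet_wrap(~year) +   # one panel per model year
  labs(x = "Number of cylinders", y = "Mean highway mileage (mpg)")

faceted_bar_plot
```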
PP 4 - Stacked Bar chart#
Task: Use the ToothGrowth dataset and make a stacked bar plot of tooth length observations (len) in each supplement type for each dose.
Notice how this is not an ideal way to visualize this data. Think about why that is…
Details about the ToothGrowth dataset:
Description
“The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).”
Source: https://stat.ethz.ch/R-manual/R-patched/library/datasets/html/ToothGrowth.html
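For the stacked bar plot task, one minimal sketch is shown below. It assumes dose goes on the x-axis as a discrete variable and that bars are filled by supplement type; `geom_col()` stacks by default, so each bar segment is the sum of the `len` observations in that group.

```r
library(ggplot2)

stacked_bar_plot <- ggplot(ToothGrowth,
                           aes(x = factor(dose), y = len, fill = supp)) +
  geom_col() +  # default position is "stack": observations are summed per group
  labs(x = "Dose (mg/day)", y = "Summed tooth length", fill = "Supplement")

stacked_bar_plot
```

Because the bar heights are sums of individual observations, the plot hides the spread within each group, which is one reason this chart type is a poor fit for these data.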
PP 5 - Faceted box plot#
Let’s try a faceted box plot instead to visualize this dataset. We will step through it and slowly improve it, layer by layer.
Task: Use the ToothGrowth dataset and make a boxplot of len for the different types of supplement (supp) with a panel for each dose.
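A basic version of the faceted box plot might look like the following sketch. Mapping `fill` to `supp` is an assumption made so that a legend exists for the later steps.

```r
library(ggplot2)

box_plot <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~dose)  # one panel per dose level (0.5, 1, 2)

box_plot
```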
PP 5.1 - Faceted Boxplot - change legend position to be below the x-axis#
Task: Modify the previous plot, changing the legend position to the bottom of the plot
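The legend position can be set through `theme()`. This sketch rebuilds the basic faceted box plot (with `fill = supp` as an assumed mapping) and then moves the legend.

```r
library(ggplot2)

box_plot_legend <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~dose) +
  theme(legend.position = "bottom")  # place the legend under the panels

box_plot_legend
```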
PP 5.2 - Faceted Boxplot - improve legend labels#
Task: Modify the boxplot again, improving the legend label to show the full names of the supplements used
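One way to relabel the legend is through the fill scale. The sketch below assumes the fill aesthetic is mapped to `supp`; the labels are given in the order of the factor levels (`OJ`, then `VC`), and the legend title "Supplement" is an illustrative choice.

```r
library(ggplot2)

box_plot_labels <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~dose) +
  theme(legend.position = "bottom") +
  # Labels follow the level order of supp: "OJ" first, "VC" second
  scale_fill_discrete(name = "Supplement",
                      labels = c("Orange juice", "Ascorbic acid"))

box_plot_labels
```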
PP 5.3 - Faceted Boxplot - Add x and y-labels#
Task: Modify the x and y labels to make them more informative
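Axis labels can be set with `labs()`. The wording of the labels below is only a suggestion; the dataset documentation does not state units for `len`, so none are given here.

```r
library(ggplot2)

box_plot_axes <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~dose) +
  theme(legend.position = "bottom") +
  labs(x = "Supplement type", y = "Tooth length")  # illustrative label text

box_plot_axes
```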
PP 5.4 - Faceted Boxplot - rotate x-axis labels by 45 degrees#
Task: Rotate the x-axis labels 45 degrees
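Rotating tick labels is done through the `axis.text.x` theme element. A minimal sketch, again rebuilding the assumed base plot:

```r
library(ggplot2)

box_plot_rotated <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~dose) +
  theme(
    legend.position = "bottom",
    # hjust = 1 right-aligns the rotated text so it ends at the tick mark
    axis.text.x = element_text(angle = 45, hjust = 1)
  )

box_plot_rotated
```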
PP 6 - Ridgeline plot#
Task: Create a ridgeline plot of the mean temperature observations in Lincoln for each month. Add a color gradient based on the temperature values.
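A ridgeline plot with a temperature gradient could be sketched as follows. It assumes the data are the `lincoln_weather` dataset shipped with ggridges (weather in Lincoln, Nebraska), with columns `Mean Temperature [F]` and `Month`. The `scale` and `rel_min_height` values and the viridis option are arbitrary choices.

```r
library(ggplot2)
library(ggridges)

ridge_plot <- ggplot(
  lincoln_weather,
  # after_stat(x) maps the fill to the temperature value along each ridge
  aes(x = `Mean Temperature [F]`, y = Month, fill = after_stat(x))
) +
  geom_density_ridges_gradient(scale = 3, rel_min_height = 0.01) +
  scale_fill_viridis_c(name = "Temp. [F]", option = "C") +
  labs(x = "Mean temperature (°F)", y = "Month")

ridge_plot
```

`geom_density_ridges_gradient()` is used instead of `geom_density_ridges()` because only the gradient variant supports a fill that varies along the x-axis.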
Congratulations!