Practice with ggplot - Part 2¶
Introduction¶
Learning Outcomes¶
In this lab you will:
Use the ggplot library in R to generate data visualizations for the following plots:
Faceted plots
Ridgeline plots
Violin plots
Grouped bar plots
Box plots
Jitter/strip plots
Apply effective design principles to data visualizations
Evaluate and critique data visualizations based on effective design principles
Extract insight(s) from a visualization
Summarize the benefits and disadvantages of two plot types showing the same data
Practice Problems (PP) - More plot types with ggplot¶
**Please note that ALL the practice problems are meant for practice and will not be graded by the TAs.**
The purpose of these practice problems is to help you learn the ggplot2 syntax and prepare you to answer the lab questions.
Packages you need to install¶
ggridges
Lahman
viridis
cowplot
Install the cowplot package using install.packages("cowplot") in your R console (do not include any install code in this notebook).
You may need to install some dependencies:
ImageMagick is a powerful tool for manipulating images and bitmaps, so I encourage you to spend a bit of time installing it. If you do have trouble, please try to complete the rest of the lab first; you should only need it for Exercise 8 and the practice problems in this lab.
On macOS: you may need to install imagemagick. The suggested method is using brew.
On Windows: you will likely not have imagemagick installed; I’d recommend the binary file from here if you need to install it.
On Ubuntu: you likely already have imagemagick, but in case you don’t, you can install the binary from here.
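Before starting, it can help to confirm which of the listed packages are already available. The snippet below is a minimal sketch meant for the R console; it installs nothing, and the variable names (pkgs, installed, missing_pkgs) are illustrative only.

```r
# Packages this lab lists as required
pkgs <- c("ggridges", "Lahman", "viridis", "cowplot")

# requireNamespace() returns TRUE when a package can be loaded,
# without attaching it to the search path
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of any packages that still need install.packages() in the console
missing_pkgs <- pkgs[!installed]
print(missing_pkgs)
```

An empty result means all four packages can be loaded.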
PP 1 - Violin plot¶
Task: Make a violin plot of the total annual hits since 2010 for the Toronto Blue Jays. Your plot should look like the plot on the right.
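One possible starting point is sketched below. It assumes the data come from the Batting table in the Lahman package and that "total annual hits" means each player's hits (H) summed per season; the lab's intended wrangling may differ, so treat the object names and axis labels as illustrative.

```r
library(ggplot2)
library(Lahman)

# Assumption: keep Toronto (teamID "TOR") rows from 2010 onwards
tor <- subset(Batting, teamID == "TOR" & yearID >= 2010)

# Assumption: total hits per player per season (players can have several stints)
tor_hits <- aggregate(H ~ playerID + yearID, data = tor, FUN = sum)

# One violin per season, showing the distribution of player hit totals
p <- ggplot(tor_hits, aes(x = factor(yearID), y = H)) +
  geom_violin() +
  labs(x = "Season", y = "Hits per player")
p
```

The year is wrapped in factor() so that ggplot draws a separate violin for each season rather than treating the year as a continuous axis.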
PP 2 - Violin plot with jittered points¶
Task: Add jittered points to the violin plot you made above. Your plot should look like the plot on the right.
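A sketch of how a jitter layer might be added follows. It repeats the same assumed wrangling (Lahman Batting table, hits summed per player and season) so that it runs on its own; the jitter width and transparency are arbitrary choices, not values from the lab.

```r
library(ggplot2)
library(Lahman)

# Same assumed wrangling as for the plain violin plot
tor <- subset(Batting, teamID == "TOR" & yearID >= 2010)
tor_hits <- aggregate(H ~ playerID + yearID, data = tor, FUN = sum)

# Violin layer first, then the raw observations on top
p <- ggplot(tor_hits, aes(x = factor(yearID), y = H)) +
  geom_violin() +
  # height = 0 keeps the hit counts exact; only the x position is jittered
  geom_jitter(width = 0.1, height = 0, alpha = 0.4) +
  labs(x = "Season", y = "Hits per player")
p
```

Setting height = 0 matters here: vertical jitter would shift the plotted hit counts away from their true values.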
PP 3 - Faceted bar chart¶
Task: Use the mpg dataset and make a bar plot of hwy vs cyl with a different panel for each year. The wrangling has been done for you; your plot should look like the plot on the right.
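The wrangling provided in the lab is not shown here, so the sketch below assumes it amounts to the mean hwy for each combination of cyl and year. With that assumption, a faceted bar chart could look like this:

```r
library(ggplot2)

# Assumed wrangling: mean highway mileage per cylinder count and year
mpg_summary <- aggregate(hwy ~ cyl + year, data = mpg, FUN = mean)

# geom_col() draws bars whose heights are the values already in the data;
# facet_wrap() creates one panel per year
p <- ggplot(mpg_summary, aes(x = factor(cyl), y = hwy)) +
  geom_col() +
  facet_wrap(~ year) +
  labs(x = "Number of cylinders", y = "Mean highway mileage (mpg)")
p
```

geom_col() is used rather than geom_bar() because the bar heights are precomputed; geom_bar() by default counts rows instead.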
PP 4 - Stacked bar chart¶
Task: Use the ToothGrowth dataset and make a stacked bar plot of tooth length observations (len) in each supplement type for each dose.
Notice how this is not an ideal way to visualize this data. Think about why that is…
Details about the ToothGrowth dataset:
Description
“The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).”
Source: https://stat.ethz.ch/R-manual/R-patched/library/datasets/html/ToothGrowth.html
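For the stacked bar task above, a minimal sketch might look like the following. Mapping dose to the x-axis and supp to the fill colour is one reading of the task, and the labels are illustrative.

```r
library(ggplot2)

# geom_col() stacks by default, so every observation becomes a segment
# of the bar for its dose, coloured by supplement type
p <- ggplot(ToothGrowth, aes(x = factor(dose), y = len, fill = supp)) +
  geom_col() +
  labs(x = "Dose (mg/day)", y = "Summed tooth length", fill = "Supplement")
p
```

Because the bars stack all 60 individual observations, their heights are sums of lengths, which hides the spread within each group.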
PP 5 - Faceted box plot¶
Let’s try a faceted box plot instead to visualize this dataset. We will step through it and slowly improve it, layer by layer.
Task: Use the ToothGrowth dataset and make a boxplot of len for the different types of supplement (supp) with a panel for each dose.
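A basic version of this plot can be sketched as follows. Mapping supp to fill as well as x is an assumption, made so that the plot has a legend to work with in the later steps.

```r
library(ggplot2)

# One box per supplement type, one panel per dose level
p <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~ dose)
p
```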
PP 5.1 - Faceted Boxplot - change legend position to be below the x-axis¶
Task: Modify the previous plot, changing the legend position to the bottom of the plot
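Legend placement is controlled through theme(). The sketch below rebuilds an assumed version of the faceted boxplot so it runs on its own, then moves the legend.

```r
library(ggplot2)

# Assumed base plot: faceted boxplot with supp mapped to fill
p <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~ dose)

# legend.position accepts "bottom", "top", "left", "right" or "none"
p_bottom <- p + theme(legend.position = "bottom")
p_bottom
```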
PP 5.2 - Faceted Boxplot - improve legend labels¶
Task: Modify the boxplot again, improving the legend label to show the full names of the supplements used
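One way to relabel the legend is through the fill scale. This sketch assumes supp is mapped to fill; the legend title and label wording are illustrative, based on the dataset description (OJ is orange juice, VC is ascorbic acid).

```r
library(ggplot2)

# Assumed base plot: faceted boxplot with supp mapped to fill
p <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~ dose) +
  theme(legend.position = "bottom")

# Labels are given in the order of the factor levels: "OJ", then "VC"
p_labelled <- p +
  scale_fill_discrete(
    name = "Supplement",
    labels = c("Orange juice", "Ascorbic acid")
  )
p_labelled
```

The labels only change what is displayed; the underlying factor levels in the data stay as OJ and VC.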
PP 5.3 - Faceted Boxplot - Add x and y-labels¶
Task: Modify the x and y labels to make them more informative
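Axis titles can be set with labs(). The wording below is only a suggestion drawn from the dataset description; the base plot is again an assumed version of the faceted boxplot.

```r
library(ggplot2)

# Assumed base plot: faceted boxplot with supp mapped to fill
p <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~ dose)

# labs() replaces the default axis titles (the raw column names)
p_labs <- p +
  labs(x = "Supplement type", y = "Odontoblast length")
p_labs
```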
PP 5.4 - Faceted Boxplot - rotate x-axis labels by 45 degrees¶
Task: Rotate the x-axis labels 45 degrees
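Rotation of the tick labels is a theme setting. A minimal sketch, using an assumed version of the faceted boxplot as the base:

```r
library(ggplot2)

# Assumed base plot: faceted boxplot with supp mapped to fill
p <- ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
  geom_boxplot() +
  facet_wrap(~ dose)

# angle rotates the tick labels; hjust = 1 right-aligns them so the end
# of each label sits under its tick mark
p_rotated <- p +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
p_rotated
```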
PP 6 - Ridgeline plot¶
Task: Create a ridgeline plot of the mean temperature observations in Lincoln for each month. Add a color gradient based on the temperature values.
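The sketch below assumes the data are the lincoln_weather dataset that ships with ggridges, with the temperature in the column named Mean Temperature [F]. The scale and rel_min_height values and the palette option are arbitrary choices, and ggplot2's built-in viridis scale is used here in place of the viridis package.

```r
library(ggplot2)
library(ggridges)

# Backticks are needed because the column name contains spaces and brackets.
# after_stat(x) maps the fill to the temperature value along each ridge.
p <- ggplot(
  lincoln_weather,
  aes(x = `Mean Temperature [F]`, y = Month, fill = after_stat(x))
) +
  geom_density_ridges_gradient(scale = 3, rel_min_height = 0.01) +
  scale_fill_viridis_c(name = "Temp. [F]", option = "C") +
  labs(x = "Mean temperature (F)", y = "Month")
p
```

geom_density_ridges_gradient() is used instead of geom_density_ridges() because only the gradient version supports a fill that varies along the x-axis.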
Congratulations!