Similar Geoms
geom_dotplot()
Boxplots provide a quick way to represent a distribution, but they leave behind a lot of information. {ggplot2} supplements boxplots with two geoms that show more information.
The first is geom_dotplot(). If you set the binaxis parameter of geom_dotplot() to "y", geom_dotplot() behaves like geom_boxplot(), displaying a separate distribution for each group of data.
Here each group functions like a vertical histogram. Add the parameter stackdir = "center", then re-run the code. Can you interpret the results?
ggplot(data = mpg) +
  geom_dotplot(mapping = aes(x = class, y = hwy), binaxis = "y",
               dotsize = 0.5, binwidth = 1, stackdir = "center")
Good job! When you set stackdir = "center", geom_dotplot() arranges each row of dots symmetrically around the \(x\) value. This layout will help you understand the next geom.
As in the histogram tutorial, it takes a lot of tweaking to make a dotplot look right. As a result, I tend to only use them when I want to make a point.
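To see what stackdir changes, it can help to build the same plot twice and compare the results. The sketch below assumes {ggplot2} is installed and uses its bundled mpg data; the object names p_up and p_center are only illustrative.

```r
library(ggplot2)

# Default stacking ("up"): dots pile up on one side of each class position.
p_up <- ggplot(data = mpg) +
  geom_dotplot(mapping = aes(x = class, y = hwy), binaxis = "y",
               dotsize = 0.5, binwidth = 1, stackdir = "up")

# Centered stacking: each row of dots is balanced around the class position.
p_center <- ggplot(data = mpg) +
  geom_dotplot(mapping = aes(x = class, y = hwy), binaxis = "y",
               dotsize = 0.5, binwidth = 1, stackdir = "center")

p_up      # draw the default version
p_center  # draw the centered version
```

Viewing the two plots one after the other should make it easier to see that only the arrangement of the dots changes, not the binning.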
geom_violin()
geom_violin() provides a second alternative to geom_boxplot(). A violin plot uses densities to draw a smoothed version of the centered dotplot you just made.
You can think of a violin plot as an outline drawn around the edges of a centered dotplot. Each “violin” spans the range of the data. The violin is thick where there are many values, and thin where there are few.
Convert the plot below from a boxplot to a violin plot. Note that violin plots do not use the parameters you saw for dotplots.
ggplot(data = mpg) +
  geom_violin(mapping = aes(x = class, y = hwy))
Good job! Another way to interpret a violin plot is to mentally “push” the width of each violin all to one side, so that the other side is a straight line. The result would be a density (e.g. geom_density()) turned on its side for each distribution.
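As a rough illustration of the "pushed to one side" idea, the sketch below draws a density for a single class and flips it so that hwy runs along the vertical axis, as it does in the violin plot. It assumes {ggplot2} is installed; the names compact and p_density are only illustrative, and the choice of the "compact" class is arbitrary.

```r
library(ggplot2)

# Keep one class so there is a single distribution to look at.
compact <- subset(mpg, class == "compact")

# A density of hwy, flipped so that hwy is on the vertical axis.
p_density <- ggplot(data = compact) +
  geom_density(mapping = aes(x = hwy)) +
  coord_flip()

p_density
```

The outline should resemble one half of the compact violin, although the two geoms may scale and smooth the curve somewhat differently by default.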
Exercise 7: Violin plots
You can further enhance violin plots by adding the parameter draw_quantiles = c(0.25, 0.5, 0.75). This will cause {ggplot2} to draw horizontal lines across the violins at the 25th, 50th, and 75th percentiles. These correspond to the three horizontal lines displayed in a boxplot (the 25th and 75th percentiles are the bounds of the box, and the 50th percentile is the median).
Add these lines to the violin plot below.
ggplot(data = mpg) +
  geom_violin(mapping = aes(x = class, y = hwy),
              draw_quantiles = c(0.25, 0.5, 0.75))
Good job! Can you predict how you would use draw_quantiles to draw a horizontal line at a different percentile, like the 60th percentile?
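One possible answer, sketched under the assumption that {ggplot2} is installed: draw_quantiles takes proportions between 0 and 1, so a single value of 0.6 should draw one line per violin at the 60th percentile. The name p_q60 is only illustrative. Newer {ggplot2} releases may report draw_quantiles as deprecated; this sketch keeps the parameter used in this tutorial.

```r
library(ggplot2)

# A single proportion draws one horizontal line per violin.
p_q60 <- ggplot(data = mpg) +
  geom_violin(mapping = aes(x = class, y = hwy), draw_quantiles = 0.6)

p_q60
```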