Similar Geoms

geom_dotplot()

Boxplots provide a quick way to represent a distribution, but they leave behind a lot of information. {ggplot2} supplements boxplots with two geoms that show more information.

The first is geom_dotplot(). If you set the binaxis parameter of geom_dotplot() to "y", it behaves like geom_boxplot(), displaying a separate distribution for each group of data.

Here each group functions like a vertical histogram. Add the parameter stackdir = "center", then re-run the code. Can you interpret the results?

ggplot(data = mpg) +
  geom_dotplot(mapping = aes(x = class, y = hwy), binaxis = "y", 
               dotsize = 0.5, binwidth = 1, stackdir = "center")

Good job! When you set stackdir = "center", geom_dotplot() arranges each row of dots symmetrically around the \(x\) value. This layout will help you understand the next geom.
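To see what stackdir changes, it can help to build the same dotplot twice. The sketch below assumes {ggplot2} is installed and uses its built-in mpg data; the object names p_up and p_center are only illustrative. It relies on "up" being the default value of stackdir.

```r
library(ggplot2)

# Default stacking ("up"): the dots in each bin pile up on one side
# of the group's position on the x axis.
p_up <- ggplot(data = mpg) +
  geom_dotplot(mapping = aes(x = class, y = hwy), binaxis = "y",
               dotsize = 0.5, binwidth = 1, stackdir = "up")

# Centered stacking: the same dots, arranged symmetrically around
# the group's position.
p_center <- ggplot(data = mpg) +
  geom_dotplot(mapping = aes(x = class, y = hwy), binaxis = "y",
               dotsize = 0.5, binwidth = 1, stackdir = "center")

p_up
p_center
```

Only the horizontal arrangement of the dots differs between the two plots; the binning along hwy is identical.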

As in the histogram tutorial, it takes a lot of tweaking to make a dotplot look right. As a result, I tend to only use them when I want to make a point.

geom_violin()

geom_violin() provides a second alternative to geom_boxplot(). A violin plot uses densities to draw a smoothed version of the centered dotplot you just made.

You can think of a violin plot as an outline drawn around the edges of a centered dotplot. Each “violin” spans the range of the data. The violin is thick where there are many values, and thin where there are few.

Convert the plot below from a boxplot to a violin plot. Note that violin plots do not use the parameters you saw for dotplots.

ggplot(data = mpg) +
  geom_violin(mapping = aes(x = class, y = hwy))

Good job! Another way to interpret a violin plot is to mentally “push” the full width of each violin to one side (so the other side is a straight line). The result would be a density (e.g. geom_density()) turned on its side for each distribution.
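One way to check this interpretation is to draw the sideways densities directly. This is a rough sketch rather than an exact reproduction of what geom_violin() computes: it draws one density per class with geom_density(), flips the axes with coord_flip(), and places the classes side by side with facet_wrap(). The name p_density is illustrative.

```r
library(ggplot2)

# One density of hwy per class, turned on its side so that hwy runs
# along the vertical axis, as it does in the violin plot.
p_density <- ggplot(data = mpg) +
  geom_density(mapping = aes(x = hwy)) +
  coord_flip() +
  facet_wrap(~ class, nrow = 1)

p_density
```

Each panel should resemble one half of the corresponding violin, although the panels do not share the violin plot's width scaling.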

Exercise 7: Violin plots

You can further enhance violin plots by adding the parameter draw_quantiles = c(0.25, 0.5, 0.75). This will cause ggplot2 to draw horizontal lines across the violins at the 25th, 50th, and 75th percentiles. These are the same three horizontal lines that are displayed in a boxplot (the 25th and 75th percentiles are the bounds of the box, the 50th percentile is the median).

Add these lines to the violin plot below.

ggplot(data = mpg) +
  geom_violin(mapping = aes(x = class, y = hwy), draw_quantiles = c(0.25, 0.5, 0.75))

Good job! Can you predict how you would use draw_quantiles to draw a horizontal line at a different percentile, like the 60th percentile?
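If you want to check your prediction, here is one possible answer, assuming a {ggplot2} version that accepts draw_quantiles (newer releases may emit a deprecation warning for this argument). The name p_q60 is illustrative.

```r
library(ggplot2)

# draw_quantiles takes proportions between 0 and 1,
# so 0.6 requests a line at the 60th percentile.
p_q60 <- ggplot(data = mpg) +
  geom_violin(mapping = aes(x = class, y = hwy), draw_quantiles = 0.6)

p_q60
```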
