Modern ggplot2

Teun van den Brand

Recent release of new major version

https://www.tidyverse.org/blog/2025/09/ggplot2-4-0-0/

Today’s topics

Headings
Patterns & gradients
Delayed evaluation
Polar coordinates
Facets

Headings

Better default titles for variables

Column metadata
Data dictionary

Column metadata

‘Pretty labels’ implemented as "label" attribute in columns.
Implemented in Hmisc, tinylabels, haven, labelled & sjlabelled

df <- mtcars

df$mpg <- haven::labelled(df$mpg, label = "Miles per gallon")

head(df$mpg)
## <labelled<double>[6]>: Miles per gallon
## [1] 21.0 21.0 22.8 21.4 18.7 18.1

attr(df$mpg, "label")
## [1] "Miles per gallon"

Column metadata

Careful with label attribute stability

df <- mtcars

attr(df$mpg, "label") <- "Miles per gallon"

head(df$mpg)
## [1] 21.0 21.0 22.8 21.4 18.7 18.1

vctrs::vec_slice(df$mpg, 1:6)
## [1] 21.0 21.0 22.8 21.4 18.7 18.1
## attr(,"label")
## [1] "Miles per gallon"
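
The stability caveat can be reproduced in base R alone. This is a minimal sketch, assuming a plain numeric vector with a manually set "label" attribute; `label_of()` is a hypothetical helper introduced here for illustration, not part of any package.

```r
# Hypothetical helper: read the "label" attribute, or NULL if absent
label_of <- function(x) attr(x, "label", exact = TRUE)

x <- mtcars$mpg
attr(x, "label") <- "Miles per gallon"

# Base subsetting drops most attributes, including "label"
sliced <- x[1:6]
label_lost <- is.null(label_of(sliced))

# One way to guard against this: copy the label back after subsetting
restored <- sliced
attr(restored, "label") <- label_of(x)
label_of(restored)
```

Classed vectors such as `haven::labelled()` carry their own subsetting methods, which is why they tend to keep the label where a bare attribute does not.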

Column metadata

Label attribute automatically detected.

library(ggplot2)
library(patchwork)

df <- mtcars
attr(df$mpg, "label") <- "Miles per gallon"

ggplot(df, aes(mpg, disp)) +
  geom_point()

Data dictionary

Example dictionary for mtcars

dict <- tibble::tribble(
  ~column, ~label,                ~unit,    ~note,
  "mpg",   "Efficiency",          "mi/gal", "Gallons are US gallons",
  "cyl",   "Number of cylinders", "",       "",
  "disp",  "Engine Displacement", "in^3",   "",
  "am",    "Transmission",        "",       "0 = automatic, 1 = manual"
  # Additional rows for the other variables
)
dict

# A tibble: 4 × 4
  column label               unit     note                       
  <chr>  <chr>               <chr>    <chr>                      
1 mpg    Efficiency          "mi/gal" "Gallons are US gallons"   
2 cyl    Number of cylinders ""       ""                         
3 disp   Engine Displacement "in^3"   ""                         
4 am     Transmission        ""       "0 = automatic, 1 = manual"

Data dictionary

Preparing the dictionary for ggplot2 labels

# Format label as named vector
named_vec <- setNames(dict$label, dict$column)
# or:
named_vec <- dplyr::pull(dict, label, name = column)

named_vec
##                   mpg                   cyl                  disp 
##          "Efficiency" "Number of cylinders" "Engine Displacement" 
##                    am 
##        "Transmission"
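
The other dictionary columns can feed into the labels as well. The sketch below is one possible way to append units to the labels before building the named vector; the `" (unit)"` formatting and the name `named_vec_units` are assumptions made for illustration.

```r
# A base R copy of the example dictionary (first four rows only)
dict <- data.frame(
  column = c("mpg", "cyl", "disp", "am"),
  label  = c("Efficiency", "Number of cylinders",
             "Engine Displacement", "Transmission"),
  unit   = c("mi/gal", "", "in^3", "")
)

# Append the unit in parentheses, but only where a unit is given
with_unit <- ifelse(
  dict$unit == "",
  dict$label,
  paste0(dict$label, " (", dict$unit, ")")
)

# Same named-vector format as before: names are columns, values are labels
named_vec_units <- setNames(with_unit, dict$column)
named_vec_units
```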

Data dictionary

ggplot(mtcars, aes(mpg, disp, colour = cyl)) +
  geom_point() +
  labs(dictionary = named_vec)

Pros

Label variables directly, rather than aesthetics
Rewards habit of annotating data
Reusable within document

ggplot(mtcars, aes(cyl, mpg, group = cyl)) +
  geom_boxplot() +
  labs(dictionary = named_vec)

Cons

Extra effort for ‘naked’ data
Expressions like factor(cyl) or cyl + 1 do not get automatic labels
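
Two possible workarounds for the expression limitation are sketched below, assuming ggplot2 4.0.0 or later for the `dictionary` argument. The column name `cyl_f` and the vector `labels` are hypothetical names chosen for this example.

```r
library(ggplot2)

# Workaround 1: compute the expression as a real column, then label that column
df <- mtcars
df$cyl_f <- factor(df$cyl)

labels <- c(mpg = "Efficiency", cyl_f = "Number of cylinders")

p1 <- ggplot(df, aes(cyl_f, mpg)) +
  geom_boxplot() +
  labs(dictionary = labels)

# Workaround 2: keep the expression in aes() and label the aesthetic by hand
p2 <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_boxplot() +
  labs(x = "Number of cylinders", dictionary = labels)
```

The first approach keeps the labelling tied to the data; the second falls back to labelling aesthetics for the one axis that the dictionary cannot resolve.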

Headings: summary

attr(data$var, "label")
labs(dictionary)

Exercise 2.1

05:00

Patterns and gradients

In R 4.1 the grid package introduced patterns and gradients.

grid::linearGradient()
grid::radialGradient()
grid::pattern()

We allow these as fill aesthetic in ggplot2.
Patterns can aid in cases of colour vision deficiency.

Gradients

Simple examples of linear and radial gradients.

# A vector of 15 colours
colours <- rev(hcl.colors(15, "Sunset"))

library(grid)

linear <- linearGradient(
  colours = colours,
  # Start point (x1, y1) and end point (x2, y2) of the gradient
  x1 = 0.5, x2 = 0.5,
  y1 = 0.0, y2 = 1.0
)

radial <- radialGradient(
  colours = colours, 
  # Parametrised like two circles
  cx1 = 0.8, cy1 = 0.8, r1 = 0.2,
  cx2 = 0.5, cy2 = 0.5, r2 = 0.5,
  # Draw separately for every area
  group = FALSE
)

Gradients

Use these gradients by providing them as a list.

p <- ggplot(mtcars) +
  aes(factor(vs))

p1 <- p + geom_bar(fill = list(linear))
p2 <- p + geom_bar(fill = list(radial))

p1 / p2

Ribbon gradient

Ribbon geometries now render a varying fill aesthetic as a gradient.

ggplot(economics) +
  aes(date, unemploy, fill = uempmed) +
  geom_area()

Patterns

Patterns are more complicated. You may need to know a little bit of grid to get these right. Here we’re using a diagonal line as a pattern.

width <- height <- unit(3, "mm")

inner_drawing <- segmentsGrob(
  # Diagonal line
  x0 = 0, x1 = 1,
  y0 = 0, y1 = 1,
  gp = gpar(col = "black"),
  vp = viewport(width = width, height = height)
)

hatching <- pattern(
  inner_drawing,
  width = width, height = height,
  extend = "repeat"
)
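
Before handing a pattern to ggplot2, it can help to preview it with grid alone. This is a minimal sketch that rebuilds the same diagonal-line pattern and draws it in a plain rectangle; whether the pattern is actually rendered depends on the graphics device, so treat the output as device-dependent.

```r
library(grid)

width <- height <- unit(3, "mm")

# One diagonal stroke inside a 3 mm tile
inner_drawing <- segmentsGrob(
  x0 = 0, x1 = 1,
  y0 = 0, y1 = 1,
  gp = gpar(col = "black"),
  vp = viewport(width = width, height = height)
)

# Repeat the tile in all directions
hatching <- pattern(
  inner_drawing,
  width = width, height = height,
  extend = "repeat"
)

# Preview: fill a rectangle covering most of the page with the pattern
grid.newpage()
grid.rect(
  width = 0.8, height = 0.8,
  gp = gpar(fill = hatching, col = "black")
)
```

Looking at the seams between tiles in this preview is a quick way to spot artefacts before they show up in a plot.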

Patterns

Like gradients, patterns can be given as a list.

ggplot(mtcars) +
  aes(factor(cyl)) +
  geom_bar(
    fill = list(hatching),
    colour = "black"
  )

Patterns

To ‘fix’ patterning artefacts, you may need to adjust the strokes in the inner drawing.

width <- height <- unit(3, "mm")
inner_drawing <- segmentsGrob(
  x0 = c(-1, -1, 0), x1 = c(1, 2, 2),
  y0 = c(0, -1, -1), y1 = c(2, 2, 1),
  gp = gpar(col = "black"),
  vp = viewport(width = width, height = height)
)
hatching <- pattern(
  inner_drawing,
  width = width, height = height,
  extend = "repeat"
)

Patterns

Using patterns as a scale.

ggplot(mtcars) +
  aes(factor(cyl), fill = factor(cyl)) +
  geom_bar(colour = "black") +
  scale_fill_manual(
    values = list(linear, radial, hatching)
  )

Patterns galore

Using the gridpattern package to easily generate patterns.

herringbone <- gridpattern::patternFill(
  pattern = "polygon_tiling",
  type = "herringbone",
  spacing = 0.2,
  units = "cm",
  colour = "grey40",
  linewidth = 0.3
)
hexagons <- gridpattern::patternFill(
  pattern = "polygon_tiling",
  type = "hexagonal",
  spacing = 0.2,
  units = "cm",
  colour = "grey40",
  linewidth = 0.3
)
waves <- gridpattern::patternFill(
  pattern = "wave",
  spacing = 0.2,
  units = "cm",
  colour = "grey40",
  linewidth = 0.3
)

Patterns galore

Using the gridpattern package to easily generate patterns.

ggplot(mtcars) +
  aes(factor(cyl), fill = factor(cyl)) +
  geom_bar(colour = "grey40") +
  scale_fill_manual(
    values = list(herringbone, hexagons, waves)
  )

Patterns galore

Parametrised patterns with the ggpattern package.

library(ggpattern)

ggplot(mtcars) +
  aes(
    x = factor(cyl),
    pattern_spacing = cyl
  ) +
  geom_bar_pattern(
    pattern_fill = "black", 
    colour = "black", fill = "white"
  ) +
  scale_pattern_spacing_continuous(
    range = c(0.02, 0.05)
  )

Patterns galore

Emoji isotype plot using text patterns.

# Helper function
width <- unit(20, "pt")
patternise_text <- function(text) {
  lapply(text, function(string) {
    grob <- textGrob(string, x = 0, hjust = 0, gp = gpar(fontsize = 18))
    pattern(
      grob,
      x = 0, hjust = 0,
      width = width, 
      extend = "repeat", 
      # Center text per bar using height/group
      height = unit(1, "npc"),
      group = FALSE
    )
  })
}

# Stats for the Netherlands
df <- data.frame(
  animal = c("chickens", "pigs",   "cows", "sheep", "goats", "humans"),
  amount = c(99900000,   11400000, 3800000, 850000, 480000,  17990000)
)

ggplot(df, aes(amount, animal, fill = animal)) +
  geom_col() +
  scale_fill_manual(
    values = patternise_text(c(
      "chickens" = "🐓",
      "pigs"     = "🐖",
      "cows"     = "🐄",
      "sheep"    = "🐑",
      "goats"    = "🐐",
      "humans"   = "🧍"
    ))
  ) +
  scale_x_continuous(
    labels = scales::label_number(scale = 1e-6, suffix = "M")
  ) +
  theme(
    legend.key.width = width,
    legend.key.height = unit(18, "pt") # see fontsize in pattern
  )

Patterns and gradients: summary

grid for most control over patterns.
- grid::pattern(), grid::linearGradient(), grid::radialGradient()
gridpattern for preformatted patterns.
- gridpattern::patternFill()
ggpattern for mapping data to patterns.
- Aesthetics (pattern_density)
- Geom layers (ggpattern::geom_boxplot_pattern())
- Scales (ggpattern::scale_pattern_density_continuous())

Exercise 2.2

05:00

Delayed evaluation

With regard to evaluation, there are three stages:

Direct input at start
After computing stat
After scale mapping

Direct input

Data available from the start, when mapped from data columns.

✅ aes(x = displ, y = hwy)

❌ Unmapped aesthetics like geom_bar(fill = "red")

❌ Data columns that are not included in aes()

After computing stat

In addition to aesthetics, computed variables become available.

Section in e.g. ?stat_density
Accessible via after_stat()
Redirection in Stat*$default_aes

# Inspect default aesthetics
StatDensity$default_aes

Aesthetic mapping: 
* `x`      -> `after_stat(density)`
* `y`      -> `after_stat(density)`
* `fill`   -> NA
* `weight` -> NULL
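
Besides the help page, the computed variables can be inspected on a built plot. The sketch below uses `layer_data()` to pull the data of the first layer after the stat has run; the plot itself is only an illustrative assumption.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ)) +
  stat_density()

# Data of layer 1 after the stat (and scales) have been applied
computed <- layer_data(p, 1)

# Columns such as density, count and scaled are the computed variables
# that after_stat() can refer to
head(computed[, c("x", "density", "count", "scaled")])
```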

After computing stat

Using after_stat() yourself to redirect/modify computed variables.

ggplot(mpg, aes(displ, drv)) +
  stat_density(
    geom = "tile", position = "identity",
    aes(fill = after_stat(scaled))
  )

After computing stat

You may have run into a histogram/density misalignment problem.

binwidth <- 2

ggplot(faithful, aes(waiting)) +
  geom_histogram(binwidth = binwidth) +
  geom_density()

After computing stat

This can be fixed by using the density computed variable in the histogram.

ggplot(faithful, aes(waiting)) +
  geom_histogram(
    aes(y = after_stat(density)), 
    binwidth = binwidth
  ) +
  geom_density()

After computing stat

Or scaling the count computed variable in the density.

ggplot(faithful, aes(waiting)) +
  geom_histogram(binwidth = binwidth) +
  geom_density(
    aes(y = after_stat(count * binwidth))
  )
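
Why multiplying by the bin width works can be checked with base R. The density stat's count is the density multiplied by the number of observations, and a histogram bar's count is the density multiplied by the number of observations and the bin width. A small sketch using `hist()`; the break positions are an assumption chosen to cover the data:

```r
binwidth <- 2
x <- faithful$waiting
n <- length(x)

# Equal-width bins that cover the full range of the data
breaks <- seq(min(x) - binwidth, max(x) + binwidth, by = binwidth)
h <- hist(x, breaks = breaks, plot = FALSE)

# Bar heights on the count scale equal density * n * binwidth
rescaled <- h$density * n * binwidth
all.equal(h$counts, rescaled)
```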

After scales

At this stage in the plot, we have mapped variables.

Determined by scale’s palette
Values now have graphical interpretation
- colour: "#4B0055"
- size: 12
- shape: "circle filled"/21
- linetype: "solid"/1
Access via after_scale()

After scales

A typical use of after_scale() is to derive the colour aesthetic from the mapped fill, or vice versa.

ggplot(mpg, aes(displ, fill = drv)) +
  geom_density(
    aes(colour = after_scale(
      scales::col_mix(fill, "black")
    ))
  )

After scales

A benefit of after_scale() is that the colours are derived from the scale's output rather than hard-coded, so you can still swap out scales.

last_plot() + 
  scale_fill_viridis_d()

After scales

Another use case can be to create half-geometries.

ggplot(mpg, aes(class, displ)) +
  geom_boxplot(aes(xmin = after_scale(x)), staplewidth = 0.3) +
  geom_violin(aes(xmax = after_scale(x)))

Staging

When you need a combination of direct input, after stat or after scale modifications, you can use stage().

stage(x) is equivalent to x.
stage(after_stat = x) is equivalent to after_stat(x).
stage(after_scale = x) is equivalent to after_scale(x).

Staging

A typical use case is when you want to initialise the aesthetic with one column, and later modify the mapped values.

ggplot(mpg, aes(drv, displ)) +
  geom_violin(
    aes(fill = stage(
      start = drv, 
      after_scale = scales::col_mix(fill, "white")
    ))
  )

Staging

Another use case is to reposition labels after computing a statistic.

ggplot(mpg, aes(drv, displ, fill = drv)) +
  geom_violin(show.legend = FALSE) +
  stat_summary(
    fun.data = ~ data.frame(
      mean = mean(.x), 
      sd   = sd(.x), 
      max  = max(.x)
    ),
    aes(
      y = stage(displ, after_stat = max + 0.4),
      label = after_stat(sprintf("%.2f±%.2f", mean, sd))
    ),
    geom = "text"
  )

Caveat

Staging functions on their own are inert.

after_stat(10)
## [1] 10

after_scale(10)
## [1] 10

stage(10, "A", mpg)
## [1] 10

They need to be put in aes().

aes(
  a = after_stat(10),
  b = after_scale(10),
  c = stage(10, "A", mpg)
)
## Aesthetic mapping: 
## * `a` -> `after_stat(10)`
## * `b` -> `after_scale(10)`
## * `c` -> `stage(10, "A", mpg)`

Delayed evaluation: summary

after_stat() to access computed variables.
after_scale() to redirect mapped values.
stage() to initiate and delay modification.

Exercise 2.3

03:00

Polar coordinates

The classic coord_polar() is superseded by coord_radial().

expand parameter
Arbitrary sectors
Donuts

Polar coordinates

Helpful to always examine plot in Cartesian coordinates.

p <- ggplot(mpg, aes(y = factor(1), fill = factor(drv))) +
  geom_bar() +
  # Add labels
  stat_count(
    aes(label = after_stat(paste0(fill, " =\n", count))),
    geom = "text",
    position = position_stack(vjust = 0.5)
  ) +
  # Turn off y-axis and legend
  scale_y_discrete(guide = "none", name = NULL) +
  scale_fill_discrete(guide = "none")
p

Polar versus radial

Differences between coord_polar() and coord_radial().

polar <- p + 
  coord_polar() + 
  labs(title = "coord_polar()")
radial <- p + 
  coord_radial() + 
  labs(title = "coord_radial()")

# Thomas will explain this here mystery later
polar + radial

Polar versus radial

Set expand = FALSE for use in pie charts.

polar <- p + 
  coord_polar() + 
  labs(title = "coord_polar()")
radial <- p + 
  coord_radial(expand = FALSE) + 
  labs(title = "coord_radial()")

polar + radial

Polar axes

coord_radial() interfaces with the guide system, mostly via guide_axis_theta(). Also note the text angles.

red_axis <- scale_x_continuous(
  guide = guide_axis_theta(
    angle = 0, 
    theme = theme_gray(ink = "red")
  )
)

# Ignores guide
(polar + red_axis) + 
  # Uses correct guide
  (radial + red_axis)

Partial circles

We’re no longer restricted to complete circles.

p + coord_radial(start = -0.4 * pi, end = 0.4 * pi)
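
The start and end arguments are given in radians, measured from 12 o'clock. If degrees are easier to reason about, a small conversion helper can be used; `deg2rad()` is a hypothetical helper defined here, not a ggplot2 function, and the base plot is a simplified stand-in for `p`.

```r
library(ggplot2)

# Hypothetical helper: convert degrees to radians
deg2rad <- function(degrees) degrees * pi / 180

# Simplified stand-in for the stacked bar used on the slides
bar <- ggplot(mpg, aes(y = factor(1), fill = factor(drv))) +
  geom_bar()

# -72 and +72 degrees correspond to -0.4 * pi and 0.4 * pi
gauge <- bar + coord_radial(start = deg2rad(-72), end = deg2rad(72))
```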

Partial circles

Switching a pie chart to a donut chart is as easy as setting the inner.radius argument.

p + coord_radial(
  expand = FALSE, 
  inner.radius = 0.5
)

Partial circles

We can combine partial polar coordinates with donuts.

p + coord_radial(
  start = 0, end = 0.5 * pi, 
  inner.radius = 0.5
)

Polar coordinates: summary

coord_radial() replaces coord_polar()
Partial circles: start & end
Donut: inner.radius

Exercise 2.4

03:00

Display of inner axes

p <- ggplot(penguins) +
  aes(bill_len, bill_dep, colour = sex) +
  geom_point(na.rm = TRUE)
p + facet_grid(island ~ species)

Display of inner axes

Inner axes can be exposed, for all directions or x or y individually.

p + facet_grid(island ~ species, axes = "all")

Display of inner axes

We can confine labels, so inner axes only display tick marks.

p + facet_grid(island ~ species, axes = "all", axis.labels = "margins")

Layout

Layers have a layout argument that can be interpreted by facets.

p +
  geom_point(
    colour = "grey", shape = 1, 
    na.rm = TRUE,
    layout = "fixed"
  ) +
  geom_point(na.rm = TRUE) +
  facet_grid(island ~ species)

Layout

facet_wrap() and facet_grid() allow placement at certain panels.

p +
  annotate(
    geom = "text", x = I(0.7), y = I(0.25), size = 2,
    label = "Adelie Penguins\non Dream island",
    layout = 4
  ) +
  facet_grid(island ~ species)

Wrap panel order

New panel ordering settings in dir argument.

as.table is now absorbed in dir
Use two-letter combination of t, r, b, l
- t = top
- r = right
- b = bottom
- l = left
The combination determines the starting point, e.g. "br" starts in the bottom-right.
The first letter indicates the growing direction, e.g. "br" grows bottom-to-top before right-to-left.
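
The two-letter rule can be made concrete with a small base R sketch. `panel_layout()` is a hypothetical function written for this illustration; it mimics the rule as described above and is not how ggplot2 computes layouts internally, so details such as the placement of empty cells may differ.

```r
# Hypothetical illustration of the two-letter `dir` rule.
# Returns a matrix showing which panel number lands in which grid cell.
panel_layout <- function(n, nrow, ncol, dir = "lt") {
  code <- strsplit(dir, "")[[1]]
  stopifnot(length(code) == 2, n <= nrow * ncol)

  # First letter l/r: fill along rows first; t/b: fill along columns first
  byrow <- code[1] %in% c("l", "r")
  cells <- c(seq_len(n), rep(NA, nrow * ncol - n))
  m <- matrix(cells, nrow = nrow, ncol = ncol, byrow = byrow)

  # Mirror the grid so that panel 1 sits in the requested corner
  if ("r" %in% code) m <- m[, rev(seq_len(ncol)), drop = FALSE]
  if ("b" %in% code) m <- m[rev(seq_len(nrow)), , drop = FALSE]
  m
}

lt <- panel_layout(5, nrow = 2, ncol = 3, dir = "lt")
br <- panel_layout(5, nrow = 2, ncol = 3, dir = "br")
lt
br
```

In the "br" result, panel 1 sits in the bottom-right cell, panel 2 directly above it, and panel 3 starts the next column to the left.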

Wrap panel order

The default order is "lt".

panels <- ~ as.integer(interaction(species, island, drop = TRUE))
p + facet_wrap(panels, dir = "lt")

Wrap panel order

p + facet_wrap(panels, dir = "tr")

Facets: summary

Display of inner axes
- axes = "margins"/"all"/"all_x"/"all_y"
- axis.labels = "all"/"margins"/"all_x"/"all_y"
layer(layout) argument
- Repeat data across panels
- Confine data to individual panels
facet_wrap(dir) sets panel layout
- Two-letter code determines start position
- First letter determines growing direction

Exercise 2.5

03:00

Next session: Text rendering and font use