ggplot2-exercises
Introduction
Creating your own graphs using ggplot2 is much more motivating than watching some slides. Time to get started yourself with your recently acquired ggplot2 skills!
Below are exercises for each part of the slides. Each time you will find an empty code block in which you can insert the necessary R code (as below).
You can check whether your code is working by clicking on the green arrow at the top right of a code block. This executes the code you wrote.
It works? Great!
Does it not work? You can also take a look at the solution key, Exercises_ggplot2_solution.Rmd or Exercises_ggplot2_solution.html!
Time to get started!
The data
Obviously, we need to start by getting our data ready and loading it into R. For this exercise, we will play a little further with the penguins data from the palmerpenguins package.
Execute the code below to load and activate the package.
💡 Don’t forget to install this package first (if you haven’t done so already)
library(palmerpenguins)
data("penguins")
Load some additional packages
Throughout the exercises, we will use two additional packages:
- tidyverse: to manipulate data
- patchwork: to combine plots
💡 Don’t forget to install these packages first (if you haven’t done so already)
library(tidyverse)
library(patchwork)
Part 1: Visualising categorical variables
1.1 Basic barplot
The penguins data contains a variable year that indicates when each penguin was observed. Create a simple barplot of the variable year.
💡 Check how the variable is defined in R. To create a barplot, a factor or character variable works better than a numeric variable.
# Code here
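As a starting point, here is a minimal sketch of one possible approach (not the official solution). It converts year to a factor directly inside aes(); the object name p_year is only illustrative.

```r
library(ggplot2)
library(palmerpenguins)

# year is stored as a number; factor() makes ggplot2 treat it as categorical,
# so every year gets its own bar
p_year <- ggplot(penguins, aes(x = factor(year))) +
  geom_bar()

p_year
```

Converting the variable beforehand (for example with mutate()) would work just as well; doing it inside aes() simply keeps the original data untouched.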
1.2 Add colors, titles and labels
Now that you have created your first barplot, it is time to add some spice to the plot! Change the colours of the bars to colours of your own choosing. Also, add meaningful titles and think about labels for the axes.
💡 This and the other assignments leave you many degrees of freedom as the designer of the plot. The solutions in the solution key are just one of many possibilities.
# Code here
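One way this could look is sketched below. The colours, the title and the axis labels are arbitrary choices, and the object name p_colours is hypothetical; any valid R colour names or hex codes can be used instead.

```r
library(ggplot2)
library(palmerpenguins)

# Map year to fill so that each bar can get its own colour
p_colours <- ggplot(penguins, aes(x = factor(year), fill = factor(year))) +
  geom_bar() +
  # Named vector: factor level = colour (arbitrary choices)
  scale_fill_manual(
    values = c("2007" = "steelblue", "2008" = "darkorange", "2009" = "seagreen")
  ) +
  labs(
    title = "Number of penguins observed per year",
    x = "Year of observation",
    y = "Number of penguins",
    fill = "Year"
  )

p_colours
```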
1.3 Change theme
Your barplot is almost ready! Now you can play around with the theme of the plot.
There are a number of default themes in ggplot2:
- theme_grey()
- theme_bw()
- theme_linedraw()
- theme_light()
- theme_dark()
- theme_minimal()
- theme_classic()
- theme_void()
- theme_test()
Reuse the code you made and make sure that it is saved as an object with the name P1.
Then, print P1 and add another theme. This code should look like this: P1 + theme_minimal()
Play around with the themes and see how the result differs!
# Code here
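The idea of saving the plot once and trying several themes can be sketched as follows. The barplot itself is a simplified stand-in for your own version, and the labels are assumptions.

```r
library(ggplot2)
library(palmerpenguins)

# Save the barplot as P1 so that themes can be added afterwards
P1 <- ggplot(penguins, aes(x = factor(year), fill = factor(year))) +
  geom_bar() +
  labs(x = "Year of observation", y = "Number of penguins", fill = "Year")

# Adding a theme while printing does not modify P1 itself,
# so each line starts again from the same base plot
P1 + theme_minimal()
P1 + theme_classic()
P1 + theme_dark()
```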
1.4 Omit legend
A final step in finishing the barplot is to omit the legend. (After all, the legend is redundant and might as well be removed.)
# Code here
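A possible way to drop the legend is theme(legend.position = "none"). The sketch below assumes a barplot similar to the one from the previous steps; the object name p_no_legend is illustrative.

```r
library(ggplot2)
library(palmerpenguins)

P1 <- ggplot(penguins, aes(x = factor(year), fill = factor(year))) +
  geom_bar() +
  labs(x = "Year of observation", y = "Number of penguins")

# Add theme() AFTER the complete theme: a complete theme such as
# theme_minimal() resets all theme settings, including legend.position
p_no_legend <- P1 +
  theme_minimal() +
  theme(legend.position = "none")

p_no_legend
```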
1.5 Lollipop plot
Rework the barplot to a lollipop plot. Reuse as much of the code you wrote as possible!
# Code here
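A lollipop plot is essentially a thin stick (a segment) with a point on top. The sketch below shows one possible approach, assuming the counts are computed first with dplyr (part of the tidyverse); the names year_counts and p_lollipop are hypothetical.

```r
library(ggplot2)
library(dplyr)
library(palmerpenguins)

# Count the penguins per year first: the lollipop needs the height as a value
year_counts <- penguins %>%
  count(year) %>%
  mutate(year = factor(year))

p_lollipop <- ggplot(year_counts, aes(x = year, y = n)) +
  # The stick: a segment from 0 up to the count
  geom_segment(aes(xend = year, y = 0, yend = n), colour = "grey50") +
  # The candy: a point at the count
  geom_point(size = 5, colour = "steelblue") +
  labs(x = "Year of observation", y = "Number of penguins") +
  theme_minimal()

p_lollipop
```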
Part 2: Visualising quantitative variables
2.1 Basic histogram
The penguins data contains measurements of bill length (bill_length_mm) and bill depth (bill_depth_mm). Both variables are expressed in millimetres.
Create a histogram of the distribution of the variable bill_length_mm. Make sure we can distinguish the three penguin species (by using a fill color).
# Code here
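One possible version of such a histogram is sketched below. The bin width of 1 mm and the transparency are arbitrary choices, and p_hist is an illustrative name.

```r
library(ggplot2)
library(palmerpenguins)

p_hist <- ggplot(penguins, aes(x = bill_length_mm, fill = species)) +
  # position = "identity" overlays the species instead of stacking them;
  # alpha keeps overlapping bars visible;
  # na.rm = TRUE silently drops penguins without a bill measurement
  geom_histogram(binwidth = 1, alpha = 0.6, position = "identity", na.rm = TRUE)

p_hist
```

Leaving out position = "identity" gives stacked bars, which is also a valid choice but makes the shape of each species' distribution harder to read.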
2.2 Add facets
Next, create a histogram for each of the three species using facet_wrap(~species). Make sure that male and female penguins can be visually distinguished (using appropriate colors). At the same time, provide a useful title and subtitle and label the scales (using self-selected labels).
💡 Attention! The sex of some penguins has not been recorded. You have to filter out these observations BEFORE creating the histogram. A piece of code has already been included to do so. That code now has a # in front of it. Remove that # so the code will be executed.
# Code here
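The filtering step followed by the faceted histogram might look like the sketch below. The filter shown here is an assumption about what the provided code does; penguins_sexed and p_facets are hypothetical names, and the colours and labels are arbitrary.

```r
library(ggplot2)
library(dplyr)
library(palmerpenguins)

# Keep only the penguins whose sex has been recorded
penguins_sexed <- penguins %>%
  filter(!is.na(sex))

p_facets <- ggplot(penguins_sexed, aes(x = bill_length_mm, fill = sex)) +
  geom_histogram(binwidth = 1, alpha = 0.6, position = "identity") +
  # One panel per species
  facet_wrap(~species) +
  scale_fill_manual(values = c(female = "darkorange", male = "steelblue")) +
  labs(
    title = "Bill length of penguins",
    subtitle = "By species and sex",
    x = "Bill length (mm)",
    y = "Number of penguins",
    fill = "Sex"
  )

p_facets
```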
2.4 Rain cloud plot
Finally, create a rain cloud plot that shows the differences between the three penguin species. You can ignore (for now) the difference between male and female penguins.
💡 Attention! To create a rain cloud plot, an additional package should be loaded (and installed): ggdist.
# Code here
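A rain cloud plot combines a half density (the cloud) with the raw observations (the rain). The sketch below is one possible construction, assuming bill length is the variable of interest; the tuning values and the names penguins_bill and p_rain are illustrative choices.

```r
library(ggplot2)
library(dplyr)
library(ggdist)
library(palmerpenguins)

# Drop the penguins without a bill length measurement
penguins_bill <- penguins %>%
  filter(!is.na(bill_length_mm))

p_rain <- ggplot(penguins_bill, aes(x = species, y = bill_length_mm)) +
  # The cloud: a half density, shifted away from the category position;
  # .width = 0 and point_colour = NA hide the interval and the point summary
  stat_halfeye(
    adjust = 0.5, width = 0.6, justification = -0.2,
    .width = 0, point_colour = NA
  ) +
  # The rain: the individual penguins, slightly jittered
  geom_point(
    position = position_jitter(width = 0.05, height = 0, seed = 1),
    size = 1, alpha = 0.3
  ) +
  coord_flip() +
  labs(x = "Species", y = "Bill length (mm)")

p_rain
```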
2.5 Add colors to rain cloud plot
A final challenge is adding color to the rain cloud plot which indicates the sex of the penguins. Do this for the layer that defines the density (stat_halfeye()) and the layer that defines the points (geom_point()).
💡 Attention! The sex of some penguins has not been recorded. You have to filter out these observations BEFORE creating the rain cloud plot. A piece of code has already been included to do so. That code now has a # in front of it. Remove that # so the code will be executed.
# Code here
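Colouring by sex can be done per layer: fill for the density and colour for the points. The sketch below assumes bill length as the measured variable and uses hypothetical names (penguins_sexed, p_rain_sex); it is one option among many.

```r
library(ggplot2)
library(dplyr)
library(ggdist)
library(palmerpenguins)

# Keep only the penguins whose sex has been recorded
penguins_sexed <- penguins %>%
  filter(!is.na(sex))

p_rain_sex <- ggplot(penguins_sexed, aes(x = species, y = bill_length_mm)) +
  # Density layer: fill depends on sex; alpha keeps both densities visible
  stat_halfeye(
    aes(fill = sex),
    adjust = 0.5, width = 0.6, justification = -0.2,
    .width = 0, point_colour = NA, alpha = 0.5
  ) +
  # Point layer: colour depends on sex
  geom_point(
    aes(colour = sex),
    position = position_jitter(width = 0.05, height = 0, seed = 1),
    size = 1, alpha = 0.5
  ) +
  coord_flip() +
  labs(x = "Species", y = "Bill length (mm)", fill = "Sex", colour = "Sex")

p_rain_sex
```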
Part 3: Visualising more than one variable
3.1 Basic scatterplot
Create a simple scatterplot with bill length (bill_length_mm) on the x-axis and bill depth (bill_depth_mm) on the y-axis. Save this plot as an object with the name P1 and print it.
💡 Take a closer look at the scatterplot. What does it tell you about the relationship between the two variables?
# Code here
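A minimal sketch of such a scatterplot, as one possible approach:

```r
library(ggplot2)
library(palmerpenguins)

# na.rm = TRUE silently drops the penguins without bill measurements
P1 <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point(na.rm = TRUE)

P1
```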
3.2 Add linear trend line
In 3.1 you created a scatterplot of the relationship between bill length and bill depth. (You saved the plot as the object P1.) Add a linear trend line to this scatterplot using geom_smooth(method = "lm"). Save that plot as the object P2 and print it.
# Code here
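Because P1 is a saved object, the trend line can simply be added as an extra layer. The sketch below rebuilds P1 first so that it runs on its own; formula = y ~ x is spelled out only to avoid the message geom_smooth() otherwise prints.

```r
library(ggplot2)
library(palmerpenguins)

P1 <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point(na.rm = TRUE)

# Add a linear regression line (with its confidence band) on top of the points
P2 <- P1 +
  geom_smooth(method = "lm", formula = y ~ x, na.rm = TRUE)

P2
```

Note the direction of the line across all penguins; it is worth keeping in mind when the species are separated in the next exercise.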
3.3 Add colors
Explore the data in more detail by adding colors to the points and the trend line that depend on the variable species. To do this, start again from scratch (do not continue building on P1 and P2).
Apply a nice(r) theme to this new plot and add appropriate labels and titles to the scatterplot.
# Code here
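Mapping colour to species in the top-level aes() makes both the points and the trend lines inherit it. The sketch below is one possible result; the theme, the labels and the name p_species are arbitrary choices.

```r
library(ggplot2)
library(palmerpenguins)

p_species <- ggplot(
  penguins,
  aes(x = bill_length_mm, y = bill_depth_mm, colour = species)
) +
  geom_point(alpha = 0.7, na.rm = TRUE) +
  # colour in the top-level aes() gives one trend line per species
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, na.rm = TRUE) +
  labs(
    title = "Bill length versus bill depth",
    subtitle = "One linear trend per species",
    x = "Bill length (mm)",
    y = "Bill depth (mm)",
    colour = "Species"
  ) +
  theme_minimal()

p_species
```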
3.4 Grouped barplot (counts)
Create a barplot that represents the relation between species and sex. The variable species should be mapped to the x-axis. Make sure that the bars are stacked upon one another and that counts are shown. Specify the fill colors of the bars by using scale_fill_brewer(type = "qual", palette = 1). Play around with the palettes and pick the one you like the most.
💡 Attention! The sex of some penguins has not been recorded. You have to filter out these observations BEFORE creating the barplot. A piece of code has already been included to do so. That code now has a # in front of it. Remove that # so the code will be executed.
# Code here
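A stacked barplot of counts could be sketched as follows. The filter is an assumption about what the provided code does, and the names penguins_sexed and p_stacked as well as the labels are illustrative.

```r
library(ggplot2)
library(dplyr)
library(palmerpenguins)

# Keep only the penguins whose sex has been recorded
penguins_sexed <- penguins %>%
  filter(!is.na(sex))

p_stacked <- ggplot(penguins_sexed, aes(x = species, fill = sex)) +
  # position = "stack" is the default for geom_bar(); spelled out for clarity
  geom_bar(position = "stack") +
  # Change the palette number to try other qualitative palettes
  scale_fill_brewer(type = "qual", palette = 1) +
  labs(x = "Species", y = "Number of penguins", fill = "Sex")

p_stacked
```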
3.5 Grouped barplot (percentages)
Reuse the code you wrote in 3.4 to create the same barplot, but showing percentages (instead of counts). Style the barplot using all the ggplot2 functions and arguments you have learned. When your barplot is finished, save it as the object my_barplot.
# Code here
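Switching from counts to percentages mostly comes down to position = "fill", which rescales every bar to a total of 1. A sketch under the same assumptions as before (filter on sex, arbitrary labels and theme):

```r
library(ggplot2)
library(dplyr)
library(palmerpenguins)

penguins_sexed <- penguins %>%
  filter(!is.na(sex))

my_barplot <- ggplot(penguins_sexed, aes(x = species, fill = sex)) +
  # position = "fill" stacks the bars and rescales each one to sum to 1
  geom_bar(position = "fill") +
  # Show the y-axis as percentages instead of proportions
  scale_y_continuous(labels = scales::label_percent()) +
  scale_fill_brewer(type = "qual", palette = 1) +
  labs(x = "Species", y = "Percentage of penguins", fill = "Sex") +
  theme_minimal()

my_barplot
```

The scales package used for the axis labels is installed together with ggplot2, so it does not need a separate library() call when referenced as scales::label_percent().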
3.6 Save your plot!
Often you want to save your plot (to use it in an article or a presentation). Saving a plot is easy in ggplot2.
Execute the code you find in the box below to save your plot. (First remove all the # characters!) To find your saved plot, go to the folder ‘Saved_figures’ (subfolder of ‘Exercise3_ggplot2’).
#ggsave(filename = "Saved_figures/my_barplot.png",
#       plot = my_barplot,
#       width = 16,
#       height = 12,
#       units = "cm",
#       bg = "white") # color of background
Final visualisation challenge
Does this figure look familiar to you?
Try to recreate it and make it more informative… (Think about colors, labels of the values on the x-axis, …) Don’t forget to first import the ‘Friends data’.
# Code here
If you like this challenge, have a look at the #tidytuesday project. This is a project of the R community that releases a raw data set each Tuesday. This data set can be used to practise data wrangling and visualisation skills. Many participants share the results of their coding fun on social media, and the code itself on GitHub.