3.2 RMarkdown for Reproducible Reports

Bootcamp R Open Science

 

As researchers, we often want our collaborators to easily understand what we have done during an analysis. We could just share our raw data and code, which should be reproducible now that you’ve completed module 3.1. However, it is much better to create a reproducible report using RMarkdown that presents your code alongside its output (graphs, tables, etc.) with conventional text to explain it. Reproducible reports aren’t just a useful way of sharing analyses with collaborators; they can also act as a record for yourself.

 

Contents

3.2.1 What is RMarkdown?

3.2.2 Installing RMarkdown

3.2.3 Creating an RMarkdown Document

3.2.4 YAML Header

3.2.5 Code Chunks and Plain Text

3.2.6 Plots

3.2.7 Tables

3.2.8 Markdown Basics

3.2.9 Useful Resources

 

 

3.2.1 What is RMarkdown?

Many programming languages used for data analysis have tools for generating reproducible reports; the most popular are Jupyter Notebooks for Python users and RMarkdown for R users. Although the two have much in common, they differ in implementation and everyday use.

 

RMarkdown is a file format (typically saved with the .Rmd extension) that can contain a YAML header, text, code chunks, and inline code. The rmarkdown package converts this file into a report, most commonly HTML or PDF. You don’t need to understand the multi-step process that rmarkdown uses to generate report outputs; in brief, knitr runs the code and writes the results into a Markdown file, which Pandoc then converts into the final output format.
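To make the four components concrete, here is a minimal sketch that writes a tiny .Rmd file from R and prints its raw source. The file name, title and chunk name are hypothetical, and the three-backtick chunk delimiter is built with strrep() only so that this example can itself be shown inside a code block.

```r
# Three backticks delimit a code chunk; built with strrep() so this example
# can be displayed inside a fenced code block.
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",                                    # YAML header starts
  'title: "A Minimal Report"',
  "output: html_document",
  "---",                                    # YAML header ends
  "",
  "This sentence is plain text.",           # text
  "",
  paste0(fence, "{r cars-summary}"),        # code chunk starts
  "summary(cars)",
  fence,                                    # code chunk ends
  "",
  "The cars data has `r nrow(cars)` rows."  # text containing inline code
)

# Hypothetical file name, saved to a temporary folder
rmd_file <- file.path(tempdir(), "minimal-report.Rmd")
writeLines(rmd_lines, rmd_file)

# Print the raw .Rmd source
cat(readLines(rmd_file), sep = "\n")
```

When the file is knitted, the inline code is replaced by the value it evaluates to, so the last sentence would report the number of rows in the cars data.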

 

 

3.2.2 Installing RMarkdown

To use RMarkdown in RStudio you first need to install and load the rmarkdown package by running:

install.packages("rmarkdown")  # Installs package
library(rmarkdown)  # Loads package
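If you want to confirm that the installation worked, a quick check along these lines may help. It is only a sketch: it reports the installed rmarkdown version and whether Pandoc (which RStudio bundles) can be found, and the variable name is hypothetical.

```r
# TRUE if the rmarkdown package can be loaded on this machine
rmarkdown_installed <- requireNamespace("rmarkdown", quietly = TRUE)

if (rmarkdown_installed) {
  message("rmarkdown version: ", as.character(packageVersion("rmarkdown")))
  message("Pandoc found: ", rmarkdown::pandoc_available())
} else {
  message('rmarkdown is not installed yet; run install.packages("rmarkdown")')
}
```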

 

 

3.2.3 Creating an RMarkdown Document

Now that you have rmarkdown installed let’s create a reproducible report! The first thing we need to do is create an RMarkdown file in RStudio:

  1. Click File
  2. Click New File
  3. Click R Markdown

The following window will now appear:

  1. Change the ‘Title’ to ‘My First Reproducible Report’ and the Author to your name
  2. Click OK

You should now have a new, untitled RMarkdown file open in the editor.

Newly created RMarkdown files come with some basic instructions and examples, but we are not interested in these, so select and delete them before saving your document. We can now populate our RMarkdown file with our own analyses and information.

 

 

3.2.4 YAML Header

The YAML header lives at the top of your RMarkdown file and is delineated by three dashes (---) at the top and at the bottom:

---
title: "My First Reproducible Report"
author: "Joe Roberts"
date: "15/06/2020"
output: html_document
---

A YAML header is optional, but when you create a new RMarkdown file through the dialog above, RStudio writes one containing the title, author, date and output format by default. It is usually a good idea to include a YAML header because it sets the document metadata (title, author and date) and the output format.

Using the above code as an example, add a YAML header containing your title, author, date and output information to the RMarkdown file we just created.

Now click Knit in the toolbar above the editor to compile the .Rmd into a .html file. RStudio will open a separate preview window to display the output of your .Rmd file and also save a .html file in the same folder as your .Rmd file. Your preview should show the title, author and date from your YAML header.

Congratulations - you’ve just compiled your first RMarkdown file! Below the YAML header is where we write code and text as well as displaying any outputs (e.g. plots, tables, etc.). Let’s take a closer look at these aspects of RMarkdown.
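The Knit button can also be approximated from the R console with rmarkdown::render(). The sketch below assumes a hypothetical file name and a temporary folder, and it only renders if both rmarkdown and Pandoc are available; it is meant as an illustration rather than a required step.

```r
# Write a minimal report to a temporary folder (hypothetical file name)
rmd_file <- file.path(tempdir(), "my-first-report.Rmd")
writeLines(c(
  "---",
  'title: "My First Reproducible Report"',
  'author: "Your Name"',
  "output: html_document",
  "---",
  "",
  "Some plain text below the YAML header."
), rmd_file)

# Render only if rmarkdown and Pandoc are both available on this machine
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

# render() returns the path of the file it created
html_file <- if (can_render) {
  rmarkdown::render(rmd_file, quiet = TRUE)
} else {
  NA_character_
}
```

Rendering from code is handy when a report needs to be rebuilt by a script rather than by hand.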

 

 

3.2.5 Code Chunks and Plain Text

Code chunks can be interspersed within the text of the report and are delineated by three backticks at the top and at the bottom. The top backticks are followed by curly brackets that specify the language (r), an optional chunk name, and any chunk options.

Everything after the chunk name HAS to be a valid R expression! When you knit, RMarkdown doesn’t pay attention to anything loaded in other R scripts or the global environment, so you MUST load all objects and packages within the RMarkdown file itself.

```{r example-code-chunk, include = TRUE}
# This is an example code chunk
```

The above code chunk specifies R as the programming language, that the chunk is called example-code-chunk, and that its code and output should be included in the knitted .html document. Let’s add some plain text and a code chunk to our own RMarkdown file. Copy the following into your RMarkdown file and knit again:

This is our first code chunk: 
```{r our-first-code-chunk, include = TRUE}
# Load the cars data into a new object
car_data <- cars

# Examine the first six rows of this object (the default for head())
head(car_data)
```

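There are lots of different knitr options available to customise the behaviour of your code chunks. For example, echo controls whether the code is shown, eval whether it is run, and message and warning whether messages and warnings appear in the report. The knitr documentation gives a full list of chunk options.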
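Rather than repeating the same options in every chunk, one common approach is to set defaults once with knitr::opts_chunk$set(), usually in a first "setup" chunk. The particular values below are assumptions chosen for illustration, not recommendations from this module.

```r
library(knitr)

# Defaults applied to every chunk that follows in the document
opts_chunk$set(
  echo = TRUE,      # show the code in the report
  message = FALSE,  # hide messages, e.g. from loading packages
  warning = FALSE   # hide warnings
)
```

Options written in an individual chunk's curly brackets still override these defaults for that chunk.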

 

 

3.2.6 Plots

Inserting plots into an RMarkdown file is very easy! The more challenging aspect is adjusting their format using knitr options. First, insert a second code chunk into your RMarkdown file called example-plot-chunk and then create a plot by adding the following code and running knit:

# Create a plot of the car data
plot(car_data)

By default, RMarkdown chooses the figure size for you (7 inches wide by 5 inches high for html_document) and scales the plot to stay within the margins of the page while maintaining its aspect ratio. To manually set the figure dimensions, you can insert an instruction into the curly brackets:

```{r example-plot-chunk, fig.width = 10, fig.height = 5, include = TRUE}
plot(car_data)
```

Hopefully you now have two code chunks in your RMarkdown file.

 

 

3.2.7 Tables

RMarkdown displays data frames and matrices as they would appear in the R console (in a monospaced font). If you prefer the data to be displayed with additional formatting, you can use the knitr::kable function:

# Create a dataframe
A <- c("a", "a", "b", "b")
B <- c(5, 10, 15, 20)
dataframe <- data.frame(A, B)

# Format using the `kable` function from the knitr package
library(knitr)
kable(dataframe, digits = 2)

Insert the above code into a third code chunk in your RMarkdown file and knit. There are many alternative ways of creating tables in RMarkdown, which you can investigate yourself.
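kable also accepts arguments for tidying up the table itself. The sketch below uses the same example values with hypothetical column names, header labels and caption, and sets the column headers, alignment and caption through the col.names, align and caption arguments.

```r
library(knitr)

# Same example values as above, with descriptive (hypothetical) column names
results <- data.frame(
  group = c("a", "a", "b", "b"),
  value = c(5, 10, 15, 20)
)

results_table <- kable(
  results,
  col.names = c("Group", "Value"),     # headers shown in the report
  align = c("l", "r"),                 # left-align text, right-align numbers
  caption = "Example values by group"  # hypothetical caption
)

results_table
```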

 

 

3.2.8 Markdown Basics

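You can format the text in your RMarkdown file with Pandoc’s Markdown, a set of markup annotations for plain text files. When you compile your file, Pandoc transforms the marked-up text into formatted text in your final file format. For example, wrapping text in single asterisks (*text*) produces italics, double asterisks (**text**) produce bold, one or more # characters at the start of a line create a heading, and lines starting with - form a bulleted list.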

A comprehensive reference guide to Markdown formatting can be found in the Pandoc User’s Guide.

 

 

3.2.9 Useful Resources