(5) R Packages, Part I
— LAST YEAR’S CONTENT BELOW —
26.14 Learning Objectives
This tutorial aims to get you started with package development in R. By the end of this tutorial, you’ll have the beginnings of an R package called `powers` (complete version). You’ll learn about key components of an R package, and how to modify them.
We’ll be going over the following topics:
- set up the directory structure for a package and put it under version control with `File` -> `New Project`
- define functions in R scripts located in the `R` directory of the package
- use `load_all` and `Build & Reload` to simulate loading the package
- use `Check` to check the package for coherence
- use `Build & Reload` to properly build and install the package
- edit the `DESCRIPTION` file of package metadata
- specify a LICENSE
- document and export the functions via `roxygen2` comments
- document the package itself via `use_package_doc()`
- create documentation and manage the `NAMESPACE` file via `document()`
- use `testthat` to implement unit testing
- use a function from another package via `use_package()` and syntax like `otherpkg::foofunction()`
- connect your local Git repo to a new remote on GitHub via `use_github()`
- create a `README.md` that comes from rendering `README.Rmd` containing actual usage, via `use_readme_rmd()`
- create a vignette via `use_vignette()` and build it via `build_vignettes()`
26.15 Participation
We’ll be developing the `powers` R package in class. Please follow along with this, developing in your participation repo.
You don’t need to follow along with every step. Sometimes it might be better to just sit back and watch. I’ll try to inform you when to do what.
26.16 Resources
This tutorial is adapted from Jenny Bryan’s STAT 547 tutorial, where she develops the `foofactors` package.
Other resources you might find useful:
- Hadley’s “R Packages” book.
  - Concise. Works with `devtools` and friends.
- Package development cheatsheet
- “Writing R Extensions”, the official guide to writing R packages.
  - Comprehensive. Doesn’t refer to `devtools` and friends.
Others on specific topics:
During exercise periods, in case you’re ahead of the class and have time, you should work on Homework 7.
26.17 Motivation
Why make a package in R? Here are just a few big reasons:
- Built-in checks that your functions are working and are sensible.
- Easy way to store and load your data: data packages like `gapminder` are awesome!
- Allows for documentation of functions that you’ve written.
- Companion for a journal article you’re writing.
Think aid for a type of analysis, not an analysis itself.
And an R package does not need to be big!
26.18 Getting Started
Install/update the `devtools` package, used as an aid in package development:
install.packages("devtools")
This will do for now – for development beyond the basics, you might need to further configure your computer.
26.19 Let’s start with a single function
26.19.1 Function creation
Follow along as we make an R package called `powers` that contains a function `square` that squares its input. Let’s initiate it:
- RStudio -> New project -> R Package
- Initiate git (optional, but recommended).
- Under the “Build” menu, click “Install and Restart”
- Check out the files that have been created
  - Rd
  - NAMESPACE
  - DESCRIPTION
Now, start a new R script in the `R` directory, called `square.R`. Write a function called `square` that squares its input.
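A minimal sketch of what `R/square.R` might contain at this stage. The one-line body is an assumption; any implementation that squares its input will do.

```r
# Square each element of the input vector
square <- function(x) x^2

square(c(1, 2, 3))
```

Because `^` is vectorized in R, the same function works on a single number or on a whole vector.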
Build the package: `Build and Reload`, or in newer versions of RStudio, `Install and Restart`.
- This compiles the package, and loads it.
- Try leaving the project, do `library(powers)`, and use the function! Pretty cool, eh?
26.19.2 Documentation
The `roxygen2` package makes documentation easy(er). Comment package functions with `#'` above the function, and use tags starting with `@`. Let’s document the `square` function.
Key tags:
- `@param`: what’s the input?
- `@return`: what’s the output?
- `@export`: make the function available upon loading the package.
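As an illustration of these tags in use, here is one way the documented `square` function could look. The title and descriptions are placeholder wording, not the text used in class.

```r
#' Square a vector
#'
#' Raises each element of a numeric vector to the power of 2.
#'
#' @param x A numeric vector.
#' @return A numeric vector the same length as \code{x}, containing the squares.
#' @export
square <- function(x) x^2
```

The first line of the block becomes the help page title, and the paragraph after the blank `#'` line becomes the description.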
Type `document()` into the console (a function from the `devtools` package). Then `Install and Restart` the package.
Your function is now documented. Check it out with `?square`! This happens due to the creation of an `Rd` file in the `man` folder.
26.19.3 Taking control of your NAMESPACE
Let’s start being intentional as to what appears in our NAMESPACE.
- Delete your NAMESPACE file.
- Add the `@export` tag to your `square` function, then run `document()` to write it to a freshly generated NAMESPACE.
Things that do not get `@export`ed can still be referred to “internally” by functions in your NAMESPACE, as we’ll see soon.
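For reference, once `roxygen2` has regenerated it, the NAMESPACE file of a package that exports only `square` would typically contain something like this (the header comment is written by `roxygen2` itself):

```r
# Generated by roxygen2: do not edit by hand

export(square)
```

Each `@export` tag in your R scripts adds one `export()` line here, so you never need to edit the file by hand.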
26.19.4 Checking
It’s a good idea to `check` your package early and often to see that everything is working.
Click `Check` under the `Build` menu. It checks lots of things for you! We’ll see more examples of this.
26.19.5 Function Dependencies
Make another, more general function, `pow`, to compute any power. It can go in the same R script as `square`, or a different one; your choice.
We’ll make `square` depend on `pow`.
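A minimal sketch of what these two functions might look like. The argument name `p` and the one-line bodies are assumptions made for illustration.

```r
# Internal helper: no @export tag, so it stays out of the NAMESPACE
pow <- function(x, p) x^p

#' @export
square <- function(x) pow(x, p = 2)

square(1:4)
```

Keeping the arithmetic in `pow` means any later change to it is picked up automatically by every function built on top of it.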
After `Install and Restart`ing, you’ll notice that you can’t use `pow` because it’s not `export`ed. But, `square` still works! We call `pow` an internal function.
Note: you should still document your internal function! But mention that the function is internal. Users will be able to access the documentation like normal, but still won’t be able to (easily) use the function.
If you want to be able to use internal functions as a developer, but don’t want users to have (easy) access to the functions, then run `load_all()` instead of `Install and Restart`.
26.19.6 Your Turn
Make and document another function, say `cube`, that raises a vector to the power of 3. Be sure to `@export` it to the NAMESPACE. Use our internal `pow` function to make `cube`, if you have it.
Finished early? Do more: work on Assignment 7, and/or try out more documentation features that come with `roxygen2` (the `@` tags).
26.20 Documentation and Testing
26.20.1 More Roxygen2 Documentation
- `\code{}` for code font
- `\link{}` to link to other function docs
- Combine: `\code{\link{function_name}}`
Enumeration:
#' \enumerate{
#' \item first item
#' \item second item
#' }
Itemization:
#' \itemize{
#' \item first item
#' \item second item
#' }
Manually labelled list:
#' \describe{
#' \item{bullet label 1}{first item}
#' \item{bullet label 2}{second item}
#' }
26.20.2 DESCRIPTION file
Every R package has this. It contains the package’s metadata. Let’s edit it:
- Add a title and brief description.
  - R is picky about these! Check out the rules.
- Add your name.
  - Use the `Authors@R` field instead of the default `Author` and `Maintainer` fields.
- Pick a license: next!
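As a rough illustration of the `Authors@R` field: its value is ordinary R code that builds a `person` object. The name and email below are placeholders, not details from this tutorial.

```r
# Placeholder author details; replace with your own
author <- person(
  given = "Jane",
  family = "Doe",
  email = "jane.doe@example.com",
  role = c("aut", "cre")  # "aut" = author, "cre" = maintainer
)

format(author)
```

In the DESCRIPTION file, the `person(...)` call itself goes after `Authors@R:`; the assignment to `author` is only there so the snippet can be run in the console.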
26.20.4 Testing with testthat
We’ve already seen package `Check`s: these check that the pieces of your R package are in place, and even that your examples don’t throw errors. But we should check not only that our functions run, but also that they give the results we’d expect.
The `testthat` package is useful for this. Initialize it in your R package by running `use_testthat()`.
As a template, save and edit the following script in a file called `test_square.R` in the `tests/testthat` folder, filling in the blanks with an `expect` statement:
context("Squaring non-numerics")
test_that("At least numeric values work.", {
  num_vec <- c(0, -4.6, 3.4)
  expect_identical(square(numeric(0)), numeric(0))
  FILL_THIS_IN
})
test_that("Logicals automatically convert to numeric.", {
  logic_vec <- c(TRUE, TRUE, FALSE)
  FILL_THIS_IN
})
Then, you can execute those tests by running `devtools::test()`, or clicking `Build` -> `Test package`.
These sanity checks are very important as your R package becomes more complex!
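To show the shape of a completed test without giving away the blanks in the template, here is a sketch for a hypothetical `cube` function. It is defined inline so the snippet runs on its own, and it assumes `testthat` is installed. In a real package the function would live in the `R` directory instead.

```r
library(testthat)

# Stand-in definition so the snippet runs outside the package
cube <- function(x) x^3

test_that("cube() handles numeric vectors", {
  expect_equal(cube(c(1, 2, 3)), c(1, 8, 27))
  expect_identical(cube(numeric(0)), numeric(0))
})

test_that("cube() rejects character input", {
  # Arithmetic on a character vector raises an error in R
  expect_error(cube("a"))
})
```

`expect_equal()` tolerates small floating-point differences, while `expect_identical()` demands an exact match, so the first is usually the safer choice for numeric results.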
26.21 Higher-level User Documentation
26.21.1 Package Documentation
Just like we do for functions, we can make a manual (`.Rd`) page for our entire R package, too. For example, check out the documentation for `ggplot2`:
?ggplot2 # Can execute only if `ggplot2` is loaded.
package?ggplot2 # Always works.
To do so, just execute `use_package_doc()`. A new R script will open containing `roxygen2`-style documentation attached to `NULL`. Document as you would a function, and run `document()` to generate the `.Rd` file.
Here’s sample documentation:
#' Convenient Computation of Powers
#'
#' Are you tired of using the power operator, \code{^} or \code{**} in R?
#' Use this package to call functions that apply common powers
#' to your vectors.
#'
#' @name powers
#' @author Me
#' @note This package isn't actually meant to be serious. It's just for
#' teaching purposes.
#' @docType package
NULL
26.21.2 Vignettes
It’s a good idea to write a vignette (or several) for your R package to show how the package is useful as a whole. Documentation for individual functions doesn’t suffice for this purpose!
To write a vignette called `"my_vignette"`, just run
use_vignette("my_vignette")
Some things happen automatically, but it’s up to you to modify the `.Rmd` document to provide adequate instruction. Change the template to suit your package. The only real “catch” is making sure the title is replaced in both places it appears.
Then just `Knit`, and then run `build_vignettes()` to build the vignettes.
Vignette woes: There seems to be resistance against building vignettes when installing. Try running `install(build_vignettes=TRUE)` to get it working.
26.21.3 README
Just as most projects should have a `README` file in the main directory, so should an R package.
Purposes:
- Inform someone stumbling across your project what they’ve stumbled across.
  - At a high level (like “This is an R package”), but also
  - somewhat at a lower level too, like your description file. This becomes a little redundant.
- I like to use the README to inform developers of the main workflow and spirit behind developing the package.
  - There are some things that you’d want other potential developers to know about the package as a whole, yet are irrelevant to users!
How to do it:
You could just make and edit a `README.md` file like normal. But you’ll probably want to briefly demonstrate some code, so you’ll need an `.Rmd`. Let `devtools` set that up for you:
use_readme_rmd()
`Knit` and you’re done!
26.21.4 Exercises
Create the above three types of documentation, without looking at my version. Then compare.
Ideally, you’ll have more to document because you’ve been working on expanding this (or another) R package for Homework 07 already.
26.22 Adding data to your R package
You can store and document datasets within R packages. Here’s one useful way.
Note: This currently doesn’t seem to be present in the companion tutorial from Jenny. Check out the R Packages “data chapter” for a resource.
Example:
Let’s add `tenvec` and `tendf` to the package:
tenvec <- 1:10
tendf <- data.frame(vec=1:10)
In the console:
- Store your data as R objects, as we’ve done above with `tenvec` and `tendf`.
- Execute `use_data(tenvec, tendf)` (one argument per object).
`tenvec` and `tendf` will be saved as `.rda` files in the new `/data` directory. These are available upon loading the package.
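To make the result more concrete, here is a rough base-R analogue of saving one file per object and loading it back. This is not the actual implementation of `use_data()`, and a temporary directory is used as a hypothetical stand-in for the package’s `data` folder.

```r
tenvec <- 1:10
tendf <- data.frame(vec = 1:10)

# Hypothetical stand-in for a package's data/ directory
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)

# One .rda file per object, named after the object
save(tenvec, file = file.path(data_dir, "tenvec.rda"))
save(tendf, file = file.path(data_dir, "tendf.rda"))

# Loading a file restores the object under its original name
restored <- new.env()
load(file.path(data_dir, "tenvec.rda"), envir = restored)
identical(restored$tenvec, tenvec)
```

The point to notice is that the object name travels with the file, which is why the data documentation refers to each dataset by its name as a string.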
To document the data, for each object (i.e., for each of `tenvec` and `tendf`), put `roxygen2`-style documentation above the character string `"tenvec"` or `"tendf"` in an R script in the `/R` folder.
Example for `tenvec`:
#' Integer vector from 1 to 10
#'
#' Self-explanatory!
#'
#' @format What format does your data take? Integer vector.
#' @source Where did the data come from?
"tenvec"
The `@format` and `@source` tags are unique to data documentation. Note that you shouldn’t use the `@export` tag when documenting data!
26.23 Dependencies
We can use functions from other R packages within our homemade R package, too. We need to do two things:
- Use the syntax `package_name::function_name()` whenever you want to use `function_name` from `package_name`.
- Indicate that your R package depends on `package_name` in the DESCRIPTION file by executing the command `use_package("package_name")`.
There are other methods, but this is the easiest.
Example: Add a `ggplot2` dependency to plot the resulting computations. Do so by adding a plot to `pow`, changing its guts to the following:
res <- x^p
if (showplot) {
  # Name the plot object `plt` so it does not overwrite the exponent `p`
  plt <- ggplot2::qplot(x, res)
  print(plt)
}
res
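Put together, the modified `pow` might look like the sketch below. The default of `FALSE` for `showplot` and the variable name `plt` are assumptions; `ggplot2` is only needed when `showplot = TRUE`.

```r
pow <- function(x, p, showplot = FALSE) {
  res <- x^p
  if (showplot) {
    # ggplot2:: is only evaluated when a plot is requested
    plt <- ggplot2::qplot(x, res)
    print(plt)
  }
  res
}

# Without a plot, the function behaves as before
pow(1:3, 2)
```

Defaulting `showplot` to `FALSE` keeps existing callers such as `square` working unchanged.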
Note 1: Here’s an example of the benefits of not having your functions do too much: I only needed to change `pow` to get the changes to work for `square` and `cube`.
Note 2: It’s probably better to use Base R’s plotting here, so that your package is as stand-alone as possible. We use `ggplot2` for expository purposes.
26.24 Launching your Package to GitHub
If I want to put an R package on GitHub, I typically just:
- Click “New” in GitHub to make a new repo. Don’t initialize with README.
- Follow the instructions GitHub provides, which involve two lines to execute in the terminal.
  - Those two lines can be found here in Jenny’s Happy git book.
There is also the `use_github()` way, although, to me, it seems overly complicated (perhaps there’s an advantage I don’t know about). It’s just a matter of following the instructions, which are not worth demonstrating here.
26.25 Time remaining?
If there’s time remaining, we’ll check out S3 OO programming in R.
- Add a “class” to the output of `pow`.
- Add some methods:
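# print method: `...` matches the print() generic; return x invisibly
#' @export
print.pow <- function(x, ...) {
  cat("Object of class 'pow':", head(unclass(x)), "\n")
  invisible(x)
}
# Define the generic before its method
#' @export
bind <- function(x) UseMethod("bind")
#' @export
bind.pow <- function(x) paste(x, collapse = ".")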
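To see the pieces working together, here is a self-contained sketch. The version of `pow` below, including its argument names and the way it sets the class, is an assumption made for illustration.

```r
# pow() tags its result with class "pow"
pow <- function(x, p) {
  res <- x^p
  class(res) <- "pow"
  res
}

# Called automatically whenever a "pow" object is printed
print.pow <- function(x, ...) {
  cat("Object of class 'pow':", head(unclass(x)), "\n")
  invisible(x)
}

# Generic plus a method for class "pow"
bind <- function(x) UseMethod("bind")
bind.pow <- function(x) paste(unclass(x), collapse = ".")

y <- pow(1:3, 2)
print(y)
bind(y)
```

Calling `bind()` on an object without the `pow` class would fail, since no default method is defined here.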