Getting Bayesian model software to work in R

NOTE FOR USDA EMPLOYEES: As of February 2024, users of USDA laptops cannot configure brms to work with CmdStanR. Please follow the instructions below, skipping the CmdStanR section.

My preferred setup for Bayesian models in R is the brms modeling package. brms is an R interface that lets you write multilevel models with simple code, using syntax similar to that of the common mixed-model package lme4. “Behind the scenes,” it fits the Bayesian model with Stan, software that implements a state-of-the-art algorithm in C++ for sampling posterior distributions quickly and efficiently. Two different packages connect Stan to R, and you can tell brms which one to use. The default is rstan, but the best option, and the one that makes models run fastest, is CmdStanR. So to get brms working in R, you need to install brms, have Stan installed on your system, and optionally install CmdStanR. Then you have to set everything up so that all those pieces of software can communicate with each other.

I want to give props to Paul Bürkner, the developer of brms, and all the developers of Stan including Andrew Gelman, Bob Carpenter, and lots of other people. Their hard work makes our life easier and our stats better!

Installing brms, Stan, and Rtools

Note: these instructions are intended for Windows users with R 4.3.x installed. If you have an older version of R, please update to the latest version before installing CmdStan.

Run the following code in your RStudio console. First you have to install the brms package and all its dependencies from CRAN. In addition, you will need to install the devtools package which will be required “behind the scenes” for compiling the models.

install.packages(c('brms', 'devtools'))

You will also need to install Rtools on your computer. This is a piece of software necessary for building the models on a Windows system. First, check whether Rtools is installed by running this code:

devtools::find_rtools()

If Rtools is already installed, you will see TRUE and you can go to the next step. If it is not installed, quit RStudio, go to the Rtools installation page on CRAN, and download the Rtools43 installer using the link (it’s a very large file, several hundred MB). Run the installer, and restart RStudio when the installation is complete.

Installing CmdStanR

If you are on a USDA laptop and you cannot set up your machine to work with CmdStanR, skip down to the Running brms section below to test whether brms and Stan were installed correctly. But if you want to set up CmdStanR as well, read on. This is a summary of the installation instructions on the CmdStanR homepage.

Install CmdStanR from its own repository:

install.packages("cmdstanr", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))

Now, because everything will be running in C++, you need to make sure your system is configured to compile C++ programs:

library(cmdstanr)
check_cmdstan_toolchain(fix = TRUE)

Once you have confirmed that the CmdStan toolchain is configured correctly, you can install Stan (more specifically, a version of it called CmdStan):

install_cmdstan()

All of the above only needs to be done once.
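Before moving on, it can help to confirm that R can actually find the CmdStan installation. The following is a minimal sketch, not part of the official instructions: cmdstan_status() is a hypothetical helper name, and it relies only on cmdstanr's cmdstan_version() and cmdstan_path() functions. It returns a message instead of stopping with an error if something is missing.

```r
# Hypothetical helper: report the CmdStan version and location,
# or explain what is missing instead of throwing an error.
cmdstan_status <- function() {
  if (!requireNamespace("cmdstanr", quietly = TRUE)) {
    return("cmdstanr package is not installed")
  }
  tryCatch(
    paste0("CmdStan ", cmdstanr::cmdstan_version(),
           " found at ", cmdstanr::cmdstan_path()),
    error = function(e) paste("CmdStan not found:", conditionMessage(e))
  )
}

cmdstan_status()
```

If the message says CmdStan was not found, rerunning install_cmdstan() is a reasonable first thing to try.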

Running brms

Every time you run an R session with brms, you need to load the package with

library(brms)

Then I prefer to set the following options:

options(mc.cores = 4, brms.backend = 'cmdstanr', brms.file_refit = 'on_change')

This tells brms to run four chains in parallel (set this to a smaller number if running four cores at once is too much for your machine) and ensures that cmdstanr is used to fit the models. It also means that when you give brm() a file argument, brms loads the saved fit from that file instead of refitting, as long as the model and data have not changed, which will save you time!
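If you are not sure how many cores your machine can spare, one option is to ask R and cap the answer at four, since brms fits four chains by default. This is only a sketch: the variable names are made up for illustration, and detectCores() from the base parallel package can return NA on some systems, so the code falls back to a single core in that case.

```r
# Count physical cores; detectCores() may return NA on some platforms.
n_physical <- parallel::detectCores(logical = FALSE)
if (is.na(n_physical)) n_physical <- 1L

# Use at most 4 cores (one per default chain) and at least 1.
n_use <- max(1L, min(4L, n_physical))
options(mc.cores = n_use)

getOption("mc.cores")
```

You may want to leave a core free for everything else you are doing, in which case subtract one from n_physical before taking the minimum.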

If you do not have CmdStanR installed, set the options without specifying the backend:

options(mc.cores = 4, brms.file_refit = 'on_change')
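If you share one script between machines with and without CmdStanR, the backend can be chosen automatically. The sketch below is one way to do that, under the assumption that a working setup means both that the cmdstanr package loads and that it can report a CmdStan version. cmdstan_ready() is a hypothetical helper, not a function from brms or cmdstanr.

```r
# Hypothetical helper: TRUE only if cmdstanr is installed AND can locate CmdStan.
cmdstan_ready <- function() {
  if (!requireNamespace("cmdstanr", quietly = TRUE)) return(FALSE)
  version <- tryCatch(cmdstanr::cmdstan_version(), error = function(e) NULL)
  !is.null(version)
}

# Fall back to rstan, the brms default backend, when CmdStan is unavailable.
backend <- if (cmdstan_ready()) "cmdstanr" else "rstan"
options(mc.cores = 4, brms.backend = backend, brms.file_refit = "on_change")
```

Running getOption("brms.backend") afterwards shows which backend was selected.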

Testing brms

Run the following line of code to have brms fit a simple model to a built-in example dataset. If you see the message Compiling Stan model..., then progress indicators for the sampler, and then model summary output, everything is working!

brm(mpg ~ hp, data = mtcars)