This lesson picks up where Lesson 1 left off, continuing a practical introduction to fitting Bayesian multilevel models in R and Stan using the brms package (Bayesian Regression Models using Stan).
Download the worksheet for this lesson here.
We started out learning the basic ideas behind Bayes. We learned how to harness the power of the Hamiltonian Monte Carlo method to fit Bayesian models, using the Stan language via the R package brms. Of course, we only scratched the surface of those complex topics. If you don’t remember any of that, now would be a good time to review Lesson 1.
After teaching Lesson 1, I got feedback from the participants. People asked me to teach them how to fit mixed models that correspond to common study designs in agricultural science. I was also asked to take a deeper dive into prior distributions and show how specifying different priors can affect the conclusions we come to. I designed Lessons 2 and 3 based on that feedback. In Lesson 2, you are going to learn how to fit Bayesian mixed models that correspond to some of those common study designs.
You will also pick up some more practical skills for fitting Bayesian models and talking about the results in papers or presentations. We’ve seen these packages already in Lesson 1, but we will keep practicing with them in this lesson.
Load all the packages and set a plotting theme. Also define a color scale that is colorblind-friendly. Create a directory called fits in your current working directory to store fitted model objects, if it does not already exist.
library(brms)
library(agridat)
library(tidyverse)
library(emmeans)
library(tidybayes)
library(ggdist)
library(bayestestR)
library(gt)
theme_set(theme_bw())
colorblind_friendly <- scale_color_manual(values = unname(palette.colors(5)[-1]), aesthetics = c('colour', 'fill'))
if (!dir.exists('fits')) dir.create('fits')
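As a quick illustration (this example plot is not part of the lesson; it uses the mpg dataset built into ggplot2), adding colorblind_friendly to a ggplot applies the palette to both the colour and fill aesthetics:

ggplot(mpg, aes(x = drv, fill = drv)) +
  geom_bar() +
  colorblind_friendly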
Set brms options. The option mc.cores = 4 means we will use four processor cores in parallel to fit the models, so four Markov chains can sample simultaneously. The option brms.file_refit = 'on_change' means that if a model has already been fit and saved to a file, and the code and data are unchanged, the model will be reloaded from the file instead of being fit again. If you are running this lesson on the cloud server, there are some pre-fit model objects to save time during the workshop.
options(mc.cores = 4, brms.file_refit = 'on_change')
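For example, here is a hypothetical model (it uses the epilepsy dataset that ships with brms and is not part of this lesson) showing how the file argument works together with brms.file_refit = 'on_change': the first run fits the model and saves it to fits/file_demo.rds, and later runs with unchanged code and data reload the saved fit instead of refitting.

file_demo <- brm(count ~ zBase + (1 | patient), data = epilepsy, family = poisson(), file = 'fits/file_demo')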
If you are running this code on the Posit Cloud server, set these options instead of the ones above. This will use the faster and less memory-intensive cmdstanr package to fit the models behind the scenes. Unfortunately, the CmdStan software that cmdstanr uses is difficult to install on USDA laptops, but it can be done. Check out instructions for installing CmdStan here, and if those don’t work, instructions for a workaround to install CmdStan using Anaconda. If you end up wanting to run Bayesian models on your own data after this workshop, another option is to run brms code on the SciNet cluster, where the cmdstanr package can easily be configured.
options(mc.cores = 4, brms.file_refit = 'on_change', brms.backend = 'cmdstanr')
Before we dive into more complex models, let’s review Bayesian statistics a little bit. Don’t feel bad if it takes you a few times to really grasp the concepts. Most people need a lot of repetition before these ideas become second nature.
Let’s start by going over Bayes’ theorem again.
\[P(A|B) = \frac{P(B|A)P(A)}{P(B)}\]
This means the probability of A being true given that B is true (\(P(A|B)\)) is equal to the probability that B is true given that A is true (\(P(B|A)\)) times the ratio of probabilities that A and B are true (\(\frac{P(A)}{P(B)}\)). Using Bayes’ theorem in the context of fitting a statistical model to data, we can state this as:
\[P(model|data) = \frac {P(data|model)P(model)}{P(data)}\]
or
\[posterior \propto likelihood \times prior\]
What does this mean? It means that we have some prior knowledge about the world. We formally state that knowledge by giving a statistical distribution of what we think the parameters of the statistical model describing the world are, without having collected any data yet. This is called the prior, which in Bayes’ theorem is represented by \(P(model)\) because it’s the probability of the model not dependent on the data.

Then, we go out and collect some data. We use the data to calculate \(P(data|model)\), which we call the likelihood. That’s a function of the model parameters, evaluated at the data we observed. It represents how likely it was to observe the data we did, given each possible combination of parameters we could have in our model. We divide by the marginal probability, \(P(data)\), which is basically just some math to make the posterior distribution have an area under the curve of 1. (In a probability distribution, the probabilities of all possible mutually exclusive outcomes have to add up to 1, or 100% probability.)

Then, when we multiply the prior and the likelihood together, we get \(P(model|data)\), which we call the posterior. That is our new knowledge of the world after collecting the data, represented by a new statistical distribution for our parameter estimates. Because it is the product of the prior and the likelihood, it represents a combination of what we knew before we got the data (the prior), updated by the new information we got from the data (the likelihood).
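To make this concrete, here is a minimal sketch (not from the lesson; the data, 7 successes out of 10 trials, are made up) that computes posterior \(\propto\) likelihood \(\times\) prior on a grid of candidate values for a binomial proportion p, with a Beta(2, 2) prior:

p_grid <- seq(0.001, 0.999, length.out = 100)      # grid of candidate values of p
prior <- dbeta(p_grid, 2, 2)                       # prior: mild belief that p is near 0.5
likelihood <- dbinom(7, size = 10, prob = p_grid)  # likelihood of the data at each candidate p
posterior <- likelihood * prior                    # posterior is proportional to likelihood times prior
posterior <- posterior / sum(posterior)            # normalize so the probabilities sum to 1
plot(p_grid, posterior, type = 'l')                # the posterior is a compromise between prior and data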