A crash course in Bayesian mixed models with brms (Lesson 5)
Introduction and review
What is this class?
Part 5 of a practical introduction to fitting Bayesian multilevel models in R and Stan
Uses the brms package (Bayesian Regression Models using Stan)
How to follow the course
Slides and text version of lessons are online
Fill in code in the worksheet (replace ... with code)
You can always copy and paste code from the text version of the lesson if you fall behind
What we’ve learned in Lessons 1-4
Relationship between prior, likelihood, and posterior
How to sample from the posterior with Hamiltonian Monte Carlo
How to use posterior distributions from our models to make predictions and test hypotheses
GLMMs with non-normal response distributions and random intercepts and slopes
Models with residuals correlated in space and time
Spatial generalized additive mixed models
Beta regression for proportion data
Zero-inflated and hurdle models for data with zeros
What will we learn in this lesson?
Cumulative logistic mixed models for ordered categorical data
Nonlinear mixed models
Big-picture concepts
Explore in more detail how link functions work in mixed models
Discover how every parameter in a mixed model can have fixed and random effects
Part 1. Cumulative logistic mixed models for ordinal data
Load packages and set options
library(brms)
library(agridat)
library(tidyverse)
library(emmeans)
library(tidybayes)
library(ggdist)
library(bayestestR)
library(gt)

theme_set(theme_bw())

colorblind_friendly <- scale_color_manual(values = unname(palette.colors(5)[-1]), aesthetics = c('colour', 'fill'))

# IF YOU ARE RUNNING THIS CODE ON YOUR OWN MACHINE
options(mc.cores = 4, brms.file_refit = 'on_change')

# IF YOU ARE RUNNING THIS CODE ON THE POSIT CLOUD SERVER (or you have configured CmdStan on your own machine)
options(brms.backend = 'cmdstanr', mc.cores = 4, brms.file_refit = 'on_change')
Categorical data
Image (c) Allison Horst
We’ve seen binary response data in lesson 2
Nominal data (for example, blood group) have their own statistical models, but we won't cover them today
Ordered categorical data
Lots of data in ag science are on an ordinal rating scale
Common in phenotyping: capture a complex phenotype, or a quantitative one that is too time-consuming to measure directly, with a simple score
Rate disease severity on a scale from 1 to 5
Instead of counting hairs on a stem, rate low/medium/high
Image (c) Univ. Groningen
Shortcut at your own risk
It is faster to score phenotypes on a rating scale than to physically measure the underlying variable directly
But this speed has a price: you need more data points to get estimates as precise as you would get from a continuous variable
Some people try to have it both ways and treat the numerical ratings as continuous variables with normally distributed error
This “quick and dirty” shortcut often works OK
But why do that when you have statistical models for ordinal response data?!?
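As a preview, here is a minimal sketch (not part of the lesson worksheet) of the two approaches in brms on simulated data; the data frame, variable names, and design are invented for illustration:

# Hypothetical example: disease severity scored 1-5 on plots nested in blocks
set.seed(1)
ratings <- data.frame(
  treatment = rep(c('control', 'treated'), each = 50),
  block     = rep(1:10, times = 10),
  severity  = sample(1:5, size = 100, replace = TRUE)
)

# The shortcut: pretend the 1-5 score is continuous with normal error
fit_gaussian <- brm(severity ~ treatment + (1 | block), data = ratings, family = gaussian())

# The ordinal model: cumulative logit, treating severity as ordered categories
fit_ordinal <- brm(severity ~ treatment + (1 | block), data = ratings, family = cumulative(link = 'logit'))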
Review: modeling binary data
For binary data we use a logistic model: binomial response distribution with logit link function
We’re trying to predict a probability, which has to be between 0 and 1
But our linear predictor, the fixed and random terms multiplied by coefficients and added together, can take any value, positive or negative
The logit, or log-odds, link function takes a probability between 0 and 1 and maps it onto a scale that can take any value, positive or negative
\(\text{logit } p = \log \frac{p}{1-p}\)
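In base R, qlogis() is the logit and plogis() is its inverse; a quick check (illustrative values only) that the mapping behaves as described:

# logit maps probabilities in (0, 1) onto the whole real line
qlogis(c(0.1, 0.5, 0.9))   # roughly -2.2, 0, 2.2
# plogis(), the inverse logit, maps any real number back into (0, 1)
plogis(c(-2.2, 0, 2.2))    # roughly 0.1, 0.5, 0.9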
Binomial distribution for binary data
Image (c) ICMA Photos
The logit model describes a binary outcome, so we only need to predict a single probability
Probability of success + probability of failure = 1
If we know one, we know the other
Model this process as following a binomial distribution, which has only one parameter
Example: model coin flips as following a binomial distribution with one parameter: \(\theta\), the probability of flipping heads
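A quick base R illustration (not part of the worksheet) of coin flips as binomial draws:

# 10 flips of a fair coin (1 = heads, 0 = tails), each a binomial draw with theta = 0.5
rbinom(n = 10, size = 1, prob = 0.5)
# theta is the only parameter: the probability of tails is just 1 - theta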
Multinomial distribution, from coins to dice
Image (c) Diacritica
If we have \(n\) ordered categories, we have to estimate \(n - 1\) probabilities
They all have to be between 0 and 1
Instead of a binomial distribution, use a multinomial distribution.
\(n - 1\) parameters, the probability of each outcome except for the last
You can model a die roll as following a multinomial distribution with equal probability, 1/6, for each of the first five outcomes
If we know the probabilities of rolling 1, 2, 3, 4, and 5, we also know the probability of rolling a 6
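A quick base R illustration (not part of the worksheet) of a die roll as a multinomial draw:

# One roll of a fair six-sided die: a vector of six counts, exactly one of which is 1
rmultinom(n = 1, size = 1, prob = rep(1/6, 6))
# Only five probabilities are free parameters; the sixth is determined by the others
p_first_five <- rep(1/6, 5)
1 - sum(p_first_five)   # 1/6, the probability of rolling a 6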
Cumulative logit link
We still use a logit link function but it’s a bit more complicated
Example: we have four categories, 1, 2, 3, 4
Estimate the probability of category 1, the probability of 2 or less, and the probability of 3 or less
(Probability of 4 or less = 1)
These are cumulative probabilities so we’ll fit a cumulative logit model
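For example, with made-up probabilities for the four categories, the cumulative probabilities are just running sums:

# Hypothetical probabilities of categories 1 through 4
p <- c(0.1, 0.2, 0.3, 0.4)
cumsum(p)   # 0.1 0.3 0.6 1.0: P(y <= 1), P(y <= 2), P(y <= 3), P(y <= 4)
# The model only needs the first three; P(y <= 4) is always 1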
Link functions in a cumulative logit model
\(n - 1\) link functions, one for each category except the last
Map our cumulative probabilities for each category to the linear predictor scale
If probabilities are \(p_{y \le 1}\), \(p_{y \le 2}\), and \(p_{y \le 3}\), the link functions are: