A crash course in Bayesian mixed models with brms (Lesson 2)

Introduction and review

What is this class?

  • Part 2 of a practical introduction to fitting Bayesian multilevel models in R and Stan
  • Uses the brms package (Bayesian Regression Models using Stan)

How to follow the course

  • Slides and text version of lessons are online
  • Fill in code in the worksheet (replace ... with code)
  • You can always copy and paste code from text version of lesson if you fall behind

Concepts we covered in Lesson 1

  • What is Bayesian inference
  • Prior, likelihood, and posterior
  • Credible intervals

Techniques we covered in Lesson 1

  • Use the Hamiltonian Monte Carlo method, implemented in Stan/brms, to fit Bayesian models
  • Fit a multilevel (mixed) model with random intercepts and random slopes
  • Specify different priors
  • Diagnose and deal with convergence problems
  • Plot model parameters and predictions
  • Compare models with information criteria
  • Do hypothesis testing with Bayesian analogues of p-values

What will we learn in this lesson?

  • Specify strongly and weakly informative priors and see how they may influence your results
  • Fit a Bayesian generalized linear model (GLM), for situations where you can’t assume normally distributed error in your model
  • Fit a Bayesian generalized linear mixed model (GLMM), to accommodate non-normal data and a mixture of fixed and random effects

Packages for a “Bayesian data-to-doc pipeline”

  • emmeans: make predictions from Bayesian models
  • bayestestR: Bayesian hypothesis testing
  • tidybayes, ggdist, and ggplot2: make nice plots of Bayesian model predictions
  • gt: make nice tables of model predictions

Back to Bayes-ics: Bayes’ Theorem

\[P(A|B) = \frac{P(B|A)P(A)}{P(B)}\]

The probability of A being true given that B is true (\(P(A|B)\)) is equal to the probability that B is true given that A is true (\(P(B|A)\)) times the ratio of probabilities that A and B are true (\(\frac{P(A)}{P(B)}\))

Bayes’ theorem in a statistical context

\[P(model|data) = \frac {P(data|model)P(model)}{P(data)}\]

or

\[posterior \propto likelihood \times prior\]

Bayes’ theorem in statistics, explained

  • We have some prior knowledge about the world, formalized with a statistical distribution of what we think the parameters are without having collected any data yet
    • prior, represented by \(P(model)\): the probability of the model, independent of the data
  • We go out and collect some data and calculate \(P(data|model)\)
    • likelihood, a function of the model parameters evaluated with the data we actually observed
    • How likely it was to observe the data we did, given all possible combinations of parameters we could have in our model
  • Divide the likelihood, \(P(data|model)\), by the marginal probability of the data, \(P(data)\), so that it has an area under the curve of 1 and becomes a probability distribution
  • Multiply the prior and this normalized likelihood together to get \(P(model|data)\)
    • posterior, our new knowledge of the world, after collecting the data, represented by a new statistical distribution for our parameter estimates

The posterior distribution

  • The posterior is the product of the prior and the likelihood
  • It represents a combination of what we knew before we got the data (the prior), updated by the new information we got from the data (the likelihood)
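
As a small worked example of that multiplication, take a conjugate case: a proportion \(p\) with a \(\text{Beta}(a, b)\) prior and a binomial likelihood from \(k\) successes in \(n\) trials (the same kind of setup as the vaccine example below). Dropping constant factors,

\[\underbrace{p^{a+k-1}(1-p)^{b+n-k-1}}_{\text{posterior}} \propto \underbrace{p^{k}(1-p)^{n-k}}_{\text{likelihood}} \times \underbrace{p^{a-1}(1-p)^{b-1}}_{\text{prior}}\]

so the posterior is \(\text{Beta}(a+k, b+n-k)\): the data counts simply add to the prior's "pseudo-counts."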

Priors have a bad reputation

  • Both classical (frequentist) approach and Bayesian approach use likelihood
  • Only Bayesian approach uses priors
  • Priors scare some people because they think they are subjective and you can use them to get any result you want
  • It’s understandable to feel that way. What arguments can we give to defend the use of priors?

Argument 1: Assumptions are not a bad thing

  • Prior distributions are not the only assumption; likelihood always involves assumptions too!
    • Example: a general linear model, Bayesian or not, assumes x-y relationship is linear, and the noise around that is normally distributed
    • That’s a strong assumption that only roughly approximates the real world
    • If you won’t make assumptions, you can’t fit any statistical model at all!

Argument 2: Priors are (often) more practical than philosophical

  • If you have a large dataset, the prior distribution usually has relatively little influence on the results
  • The prior makes the computations needed to fit the statistical model a little bit easier
  • Some complex models cannot be fit at all without the gentle guidance a prior gives the fitting algorithm
  • In those cases, only the Bayesian model can give us an answer in the first place

Argument 3: We should always check sensitivity of results to assumptions anyway!

  • If you make a major assumption in constructing your model, you should check whether the results are sensitive to it, priors or no priors
  • If the results don’t change as a result of changing the prior, then say so
  • If the results do change as a result of changing the prior, then present that and allow the reader to decide
  • If you know that your results are sensitive to choice of prior, it is wrong to present results from a single model with a single prior distribution as if they are the only “true” results

How the posterior is influenced by the prior and the data: example

Image (c) Washington State University
  • Example: We want to know the probability of a new vaccine successfully granting immunity to a disease.
  • Baseline immunity rate is 50%
  • Pessimistic/conservative prior assumption that the vaccine is likely to be ineffective at changing immunity rate, but it might have an effect
  • Assume a prior distribution with the most probable outcome being 50%, but allowing for outcomes both above and below that

Collect different amounts of data

  • Vaccinate animals and then challenge them with a virus and measure their immunity
  • Three different hypothetical scenarios
             | Number of animals tested | Number immune | Number not immune | Observed immunity rate
Scenario 1   | 10                       | 7             | 3                 | 70%
Scenario 2   | 50                       | 35            | 15                | 70%
Scenario 3   | 100                      | 70            | 30                | 70%

Posterior distributions get closer to likelihood with increasing sample size

plots showing influence of likelihood on posterior
  • As the amount of data goes up, evidence that the true immunity rate is 70% becomes stronger
  • The influence of our prior distribution centered at 50% keeps getting weaker and weaker
  • The influence of the likelihood (informed by the data) keeps getting stronger
  • The more data you have, the more evidence you have, and the more the likelihood pulls the posterior towards itself and away from the prior
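
A minimal sketch of this behavior in code, assuming for illustration a Beta(10, 10) prior (peaked at 50%; the exact prior behind the plots above isn't specified) and using the conjugate update from the worked example earlier:

# Posterior for the immunity rate under an assumed Beta(10, 10) prior (peaked at 50%),
# combined with the binomial data from each scenario: posterior is Beta(a + immune, b + not immune)
p <- seq(0, 1, length.out = 501)
prior_a <- 10; prior_b <- 10
n      <- c(10, 50, 100)
immune <- c(7, 35, 70)

plot(p, dbeta(p, prior_a, prior_b), type = 'l', lty = 2, ylim = c(0, 10),
     xlab = 'immunity rate', ylab = 'density')   # dashed line: the prior
for (i in 1:3) {
  lines(p, dbeta(p, prior_a + immune[i], prior_b + n[i] - immune[i]), col = i + 1)
}
abline(v = 0.7, lty = 3)                         # observed immunity rate in all three scenarios
legend('topleft', lty = c(2, 1, 1, 1), col = 1:4,
       legend = c('prior', paste0('posterior, n = ', n)))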

Hands-on example of prior distribution influence

  • If you have lots of data, you don’t have to worry that much about the influence of the prior
  • But what if you have a very small dataset?
  • Let’s do a hands-on example using data on a controversial topic where people may have strong prior beliefs

Setup

Load packages.

library(brms)
library(agridat)
library(tidyverse)
library(emmeans)
library(tidybayes)
library(ggdist)
library(bayestestR)
library(gt)
  • Set plotting options (including colorblind-friendly palette)
  • Create directory for fitted model objects if it does not exist
theme_set(theme_bw())
colorblind_friendly <- scale_color_manual(values = unname(palette.colors(5)[-1]), aesthetics = c('colour', 'fill'))

if (!dir.exists('fits')) dir.create('fits')
  • Set brms options
  • Cloud server has pre-fit model objects loaded to save time
options(mc.cores = 4, brms.file_refit = 'on_change', brms.backend = 'cmdstanr')
  • If you are on a local machine without CmdStanR, do not set brms.backend
options(mc.cores = 4, brms.file_refit = 'on_change')

Meet the example data

  • gathmann.bt example dataset from agridat package
  • Abundance of non-target arthropod species measured on transgenic Bt maize and control (isogenic line).
  • Type ?gathmann.bt in your console to pull up documentation for the dataset
  • Does Bt corn have bigger impacts on non-target species than conventional insecticide, or is it safer?

Image (c) Science 2.0
  • Small sample size: \(n=8\) Thysanoptera abundance measurements on each line
data(gathmann.bt)

ggplot(gathmann.bt, aes(x = gen, y = thysan)) + geom_boxplot()
  • Set factor order so that the control line (ISO) is treated as the reference level in the model
  • A positive coefficient in the model will mean that Bt has higher Thysanoptera abundance than control
gathmann.bt <- mutate(gathmann.bt, gen = relevel(gen, ref = 'ISO'))

Frequentist analysis: two-sided t-test

t.test(formula = thysan ~ gen, data = gathmann.bt)
  • We ask: is there enough evidence to reject the null hypothesis that the two means are equal?
  • Only considers the likelihood, which is calculated from the data
  • Does not formally incorporate any prior knowledge

Bayesian analyses with different priors

  • The response variable ranges from ~4 to 18, so an effect of +5 or -5 is “big”
  • We’re going to fit the model with 7 different priors
  • Default completely uninformative “flat” prior assigns equal prior probability to any effect size
  • Not realistic for most situations
  • We usually at least know the order of magnitude of expected effect size, even if we don’t know if it will be positive or negative

Six other possible prior distributions

prior distribution | belief
normal(0,5)        | We don't know if Bt has a positive or negative effect and we think there's a moderate probability of a large effect in either direction
normal(0,1)        | We don't know if Bt has a positive or negative effect and we think large effects in either direction are very improbable
normal(5,5)        | We think Bt has a positive effect but we're very uncertain
normal(5,1)        | We think Bt has a positive effect and we're very certain of that
normal(-5,5)       | We think Bt has a negative effect but we're very uncertain
normal(-5,1)       | We think Bt has a negative effect and we're very certain of that

Fit the models

  • Defaults to 4 Markov chains, 1000 warmup iterations, 2000 total iterations per chain
  • No need to increase sampling iterations to fully characterize posterior
# Flat prior
m1gathmann <- brm(thysan ~ gen, data = gathmann.bt, 
                  seed = 838, file = 'fits/m1gathmann')
# Prior saying that extreme differences between genotypes are unlikely
m2gathmann <- brm(thysan ~ gen, data = gathmann.bt,
                  prior = prior(normal(0, 5), class = b),
                  seed = 838, file = 'fits/m2gathmann')
# Prior with even stricter regularization
m3gathmann <- brm(thysan ~ gen, data = gathmann.bt,
                  prior = prior(normal(0, 1), class = b),
                  seed = 838, file = 'fits/m3gathmann')
# Prior if we think that Bt is more likely to have a positive impact on non-target species but we are uncertain
m4gathmann <- brm(thysan ~ gen, data = gathmann.bt,
                  prior = prior(normal(5, 5), class = b),
                  seed = 838, file = 'fits/m4gathmann')
# Prior if we think that Bt is more likely to have a positive impact on non-target species and we have strong prior evidence for that
m5gathmann <- brm(thysan ~ gen, data = gathmann.bt,
                  prior = prior(normal(5, 1), class = b),
                  seed = 838, file = 'fits/m5gathmann')
# Prior if we think that Bt is more likely to have a negative impact on non-target species but we are uncertain
m6gathmann <- brm(thysan ~ gen, data = gathmann.bt,
                  prior = prior(normal(-5, 5), class = b),
                  seed = 838, file = 'fits/m6gathmann')
# Prior if we think that Bt is more likely to have a negative impact on non-target species and we have strong prior evidence for that
m7gathmann <- brm(thysan ~ gen, data = gathmann.bt,
                  prior = prior(normal(-5, 1), class = b),
                  seed = 838, file = 'fits/m7gathmann')

Comparison of posterior estimates

  • Pull out the posterior estimates for the difference between the means
  • Plot the point estimate of the posterior with its 95% credible interval
get_post <- function(fit) {
  fixedeff <- summary(fit)$fixed
  setNames(fixedeff['genBt', c('Estimate','l-95% CI', 'u-95% CI')], c('median', 'lower95', 'upper95'))
}

gathmann_post_estimates <- cbind(
  prior = c('flat', 'normal(0,5)', 'normal(0,1)', 'normal(5,5)', 'normal(5,1)', 'normal(-5,5)', 'normal(-5,1)'),
  map_dfr(list(m1gathmann, m2gathmann, m3gathmann, m4gathmann, m5gathmann, m6gathmann, m7gathmann), get_post)
) %>%
  mutate(prior = factor(prior, levels = unique(prior)))

ggplot(gathmann_post_estimates, aes(x = prior, y = median, ymin = lower95, ymax = upper95)) +
  geom_pointrange() +
  geom_hline(yintercept = 0, linetype = 'dashed', linewidth = 1) +
  labs(y = 'effect of Bt on Thysanoptera abundance')

Comparison of posterior estimates

  • \(\text{Normal}(0,5)\), \(\text{Normal}(5,5)\), and \(\text{Normal}(-5,5)\): all similar inference to the flat prior
  • \(\text{Normal}(5, 1)\): similar point estimate to flat prior, but narrower uncertainty interval (informative prior)
  • \(\text{Normal}(0, 1)\): informative prior that assigns a very low prior probability to strong positive and strong negative effects
  • \(\text{Normal}(-5, 1)\): informative prior that assigns low prior probability to everything except strong negative effects
  • The informative priors with low standard deviations influence our conclusions

Comparison of posterior estimates: Bayesian-style p-values

  • “maximum a posteriori” p-value, or pMAP (see Lesson 1)
  • Ratio of posterior probability of null value to probability of posterior point estimate
  • The lower pMAP is, the stronger the evidence in favor of the posterior estimate relative to null
gathmann_pvalues <- data.frame(
  prior = c('flat', 'normal(0,5)', 'normal(0,1)', 'normal(5,5)', 'normal(5,1)', 'normal(-5,5)', 'normal(-5,1)'),
  pMAP = c(p_map(m1gathmann)$p_MAP[2], p_map(m2gathmann)$p_MAP[2], p_map(m3gathmann)$p_MAP[2], p_map(m4gathmann)$p_MAP[2], p_map(m5gathmann)$p_MAP[2], p_map(m6gathmann)$p_MAP[2], p_map(m7gathmann)$p_MAP[2])
)

Influence of priors: recap

  • Prior can be influential if it has a narrow distribution
  • Even though \(\text{Normal}(0, 1)\) assigns equal prior probability to positive and negative outcomes, it’s informative because it discounts large effects
  • But just because Bayesian result does not agree with frequentist maximum likelihood result, that doesn’t mean it’s wrong!
  • Maybe the skeptical prior with low probability of extreme differences is actually a good thing!
  • This is an extreme example: it’s never good to draw big conclusions from a tiny dataset anyway
  • Ideal to document the influence of the prior and let the reader decide!

Generalized linear mixed models

  • So far we have fit simple models where Bayesian and frequentist methods both work well
  • Let’s explore some territory where Bayesian methods really shine!

Review: general linear mixed models

\[y \sim \text{Normal}(Xb + Zu, \sigma), u \sim \text{Normal}(0, \sigma_u)\]

  • This means response \(y\) is a function of fixed effects and random effects
    • \(X\): the predictor variables
    • \(b\): the fixed effect parameters
    • \(Z\): model matrix that says which group (block) each observation belongs to
    • \(u\): the random effect parameters
    • \(\sigma\): standard deviation of the normally distributed residual noise

Review: general linear mixed models

\[y \sim \text{Normal}(Xb + Zu, \sigma), u \sim \text{Normal}(0, \sigma_u)\]

  • Random effects \(u\) are normally distributed too
  • Some random effects are positive, others are negative, a few are far from zero, most are close to zero
  • Same goes for the model residuals
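
A minimal simulation sketch (made-up data and variable names, purely for illustration) showing how the pieces of this equation map onto brms formula syntax:

# Simulate data matching y ~ Normal(Xb + Zu, sigma) and fit it with brms
set.seed(1)
sim <- data.frame(
  block = rep(letters[1:6], each = 10),   # 6 blocks, 10 observations each
  x     = runif(60)                       # a fixed-effect predictor
)
block_effect <- rnorm(6, 0, 1)            # u ~ Normal(0, sigma_u): random intercepts
sim$y <- 2 + 3 * sim$x +                  # Xb: intercept and slope (fixed effects)
  block_effect[as.integer(factor(sim$block))] +  # Zu: each row gets its block's intercept
  rnorm(60, 0, 0.5)                       # residual noise with standard deviation sigma

sim_fit <- brm(y ~ x + (1 | block), data = sim)  # fixed effect of x, random intercept per block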

Non-normal data: why is it a problem?

  • Not all datasets fit neatly into the general linear model box
  • Many response variables can’t be “forced” to have normally distributed residuals
    • For example, what if y is a binary variable with only 0 and 1 values?
    • Or what if it is a highly skewed concentration variable with many values at or near zero, and a few high positive values?
  • In those cases, you cannot get residuals that are normally distributed, and centered at zero, with about half of them positive and half of them negative

Non-normal data: making predictions

  • Predictions made from a general linear mixed model fit to non-normal data often don’t make sense
    • Model fit to binary data may give you predicted probabilities greater than 1 or less than 0
    • Model fit to the skewed data may give you negative predictions
  • Even if you don’t care about prediction but only about inference, this is still a problem
  • If your model doesn’t make good predictions, it doesn’t fit the data well, and you need a good fit to make inference!

Dealing with non-normal data: transforming the data

  • Old school solution: transform the data!
  • Not ideal because it introduces bias (possible exception: log transformation)
  • We should find a model that fits the data, instead of torturing the data to make it fit the model
  • Better alternative: generalized linear models!

Generalized linear mixed models: an introduction

\[g(y) \sim \text{Distribution}(Xb + Zu | \theta), u \sim \text{Normal}(0, \sigma_u)\]

  • The distribution can be any of a wide variety of distributions, with parameters \(\theta\)
  • The left side now says \(g(y)\) instead of just \(y\): link function
    • Gives us the relationship between the scale of the data and the scale on which the model makes its predictions
    • The link function converts from the data scale to the model scale; its inverse converts model predictions back to the data scale

Generalized linear model example: Orobanche seed germination

What is a generalized linear mixed model?

  • generalized: it allows for response distributions that aren’t normal
  • linear: Y is modeled as a linear combination of effects (multiply each variable by something, then add them together)
  • mixed: it includes both fixed and random effects
  • This example will be generalized and linear, but not mixed (fixed effects only)

Meet the example data

Image (c) Wikimedia Commons, user Eitan_f
  • crowder.seeds example dataset from agridat package
  • Orobanche aegyptiaca: parasitic plant with no chlorophyll, gets its energy from other plants’ roots
  • It needs a host to survive so the seeds do not germinate until they detect a nearby host by sensing chemicals in soil
  • 2×2 factorial experiment
    • Two Orobanche genotypes ('O73' and 'O75')
    • Treated with two different extracts (bean and cucumber)

Experimental design

  • Many plates, each with a variable number of seeds from one treatment combination
    • n: total number of seeds on the plate
    • germ: the number of seeds that germinated
  • Completely randomized design: no blocks, so no random effects
  • Compute and plot p (proportion of germinated seeds)
data(crowder.seeds)

crowder.seeds <- mutate(crowder.seeds, p = germ/n)

ggplot(crowder.seeds, aes(x = gen, y = p, color = extract)) +
  geom_boxplot() + colorblind_friendly

Fit a linear model to the proportions

  • This is OK in many cases if proportions are between ~0.05 and ~0.95
  • Assumption of normally distributed error can still work out
  • Because proportions are between 0 and 1, \(\text{Normal}(0, 1)\) prior on categorical fixed effect is completely uninformative (it encompasses any possible effect size)
crowder_lm <- brm(p ~ gen * extract, 
                  prior = c(
                    prior(normal(0, 1), class = b)
                  ),
                  data = crowder.seeds, 
                  seed = 111, file = 'fits/crowder_lm')

Fit a generalized linear model to the binary outcomes

  • Outcome is binary (0 = seed did not germinate, 1 = seed germinated)
  • Response variable is specified with two parts
    • Number of successes (germinated seeds in each dish)
    • Total number of trials (all seeds in each dish including the ones that didn’t germinate)
  • germ | trials(n)

Response family

  • family = <distribution function>(link = '<link function>')
  • What is the response distribution?
  • And what is the link function that determines the relationship between the observed data scale and the model prediction scale?
  • Here we are going to use family = binomial(link = 'logit')
    • The binomial distribution describes the number of "successes" out of a fixed number of trials, where each trial has only two possible outcomes (0 or 1)
    • But what is that “logit” thing?

Logit = log odds

\[\text{logit}~p = \log \frac {p}{1-p}\]

  • Maps values ranging from 0 to 1 to values ranging from \(-\infty\) to \(+\infty\)
  • More or less a straight line when p is between about 0.25 and 0.75, but gets steep as p approaches 0 or 1
  • For an event with an extremely low (or high) chance of happening, a small increase (or decrease) in the probability multiplies the odds by a lot

graph of logit function
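
A quick numerical check of the mapping, using base R's qlogis() (the logit) and plogis() (its inverse):

# logit maps probabilities in (0, 1) to the whole real line; plogis() undoes it
qlogis(c(0.05, 0.25, 0.5, 0.75, 0.95))   # approximately -2.94, -1.10, 0, 1.10, 2.94
plogis(qlogis(0.7))                      # back-transforming recovers the original 0.7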

GLM code

  • normal(0, 5) prior is sensible for fixed effects on the log odds scale
  • Again we will use all default model fitting options
crowder_glm <- brm(germ | trials(n) ~ gen * extract, 
                   family = binomial(link = 'logit'),
                   prior = c(
                     prior(normal(0, 5), class = b)
                   ),
                   data = crowder.seeds, 
                   seed = 112, file = 'fits/crowder_glm')

Comparing LM and GLM

What do you notice about the posterior predictive check plots?

  • thick black line: distribution of the observed data
  • each thin blue line: distribution of the predicted values from one random draw from the posterior
pp_check(crowder_lm)
pp_check(crowder_glm)

LM and GLM: Posterior predictive check plots

  • Neither is too bad: the shape of the fitted values is similar to the shape of the observed data
  • Scale of x-axis is different
  • But notice some predictions from LM are <0 and >1
  • Why? Why is this a problem?

LM: Estimating marginal means

  • pairwise ~ extract | gen: within each level of genotype (gen), estimate the predicted mean for each level of extract and take the pairwise contrast between them
  • LM contrasts are taken by subtracting one modeled proportion from the other
emm_crowder_lm <- emmeans(crowder_lm, pairwise ~ extract | gen)

emm_crowder_lm$emmeans
emm_crowder_lm$contrasts

GLM: Estimating marginal means

  • Estimated marginal means on the logit scale
  • Contrasts on the log odds ratio scale
emm_crowder_glm <- emmeans(crowder_glm, pairwise ~ extract | gen)

emm_crowder_glm$emmeans
emm_crowder_glm$contrasts

Estimated marginal means on the response scale

  • The argument type = 'response' transforms the results back to the data scale
  • The logit link function in R is qlogis() and its inverse is plogis()
emmresponse_crowder_glm <- emmeans(crowder_glm, pairwise ~ extract | gen, type = 'response')

emmresponse_crowder_glm$emmeans
emmresponse_crowder_glm$contrasts

Odds ratios

  • Now estimated marginal means and credible intervals are probabilities
  • Contrasts are odds ratios
    • Ratio of the odds of germination between the two levels of extract
    • An odds ratio of 1 indicates no difference between the odds
    • The ratio is bean / cucumber so we have evidence that bean extract is less effective at stimulating germination than cucumber extract
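
A small worked example of the odds-ratio arithmetic, with made-up probabilities (not values from the model):

# Convert two hypothetical germination probabilities to odds, then take the bean / cucumber ratio
p_bean     <- 0.4
p_cucumber <- 0.6
odds_bean     <- p_bean / (1 - p_bean)             # 0.67
odds_cucumber <- p_cucumber / (1 - p_cucumber)     # 1.5
odds_bean / odds_cucumber                          # odds ratio ~0.44: lower odds of germination with bean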

Hypothesis testing

  • What if we want to do a test relative to a null hypothesis?
  • Use describe_posterior() from the bayestestR package
  • Describe the posterior distributions of the differences between the means of the extract treatments within each genotype for the general linear model
  • Describe the posterior distributions of the odds ratios of the means for the generalized linear model

describe_posterior() explained

  • We have 1000 sampling iterations from each of 4 chains: total of 4000 posterior draws for the distribution of each contrast
  • By default describe_posterior() gives us the median of the 4000 values as the point estimate
  • Specify equal-tailed credible interval using ci_method = 'eti', default width 95%
    • We get 2.5% and 97.5% quantiles of the 4000 values

Bayesian-style p-values revisited

  • Request two different null hypothesis tests (see Lesson 1)
    • p_map: the ratio of the probability of the null value to the most probable posterior value (point estimate)
    • pd (probability of direction): the percent of the posterior distribution that has the same direction (sign) as the point estimate; the closer to 100%, the more evidence in favor of a non-null difference
    • We have to specify null = 1 for the odds ratio contrasts (ratio of 1 is no difference)
describe_posterior(emm_crowder_lm$contrasts, ci_method = 'eti', test = c('pd', 'p_map'))
describe_posterior(emmresponse_crowder_glm$contrasts, ci_method = 'eti', test = c('pd', 'p_map'), null = 1)

Comparing inference from LM and GLM

  • The generalized linear model provides stronger evidence in favor of differences
  • It is taking into account the greater precision we can get from having many partially independent binary outcomes (seeds) in each dish
  • 200/1000 is more evidence than 2/10 (both are treated the same in the linear model fit to proportions)
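
A back-of-the-envelope illustration of that extra precision: the standard error of an observed proportion shrinks as the number of seeds per dish grows, even when the proportion itself is identical.

# Standard error of an observed proportion is sqrt(p * (1 - p) / n)
p_hat <- 0.2
sqrt(p_hat * (1 - p_hat) / 10)     # ~0.13 for 2 germinated out of 10 seeds
sqrt(p_hat * (1 - p_hat) / 1000)   # ~0.013 for 200 germinated out of 1000 seeds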

Assessing evidence for interaction between treatments

  • Is the difference between the response to bean and cucumber extract different between the two genotypes?
  • Call the contrast() function and specify two levels of interaction contrast
  • First contrast the extracts within each genotype, then contrast those two contrasts with each other
    • 'revpairwise': reverse pairwise
emmeans(crowder_glm, ~ extract + gen, type = 'response') %>%
  contrast(interaction = c('revpairwise', 'pairwise')) %>%
  describe_posterior(ci_method = 'eti', test = c('pd', 'p_map'))

Presentation ideas: How to report this analysis in a manuscript?

  • Description of the statistical model for the methods section
  • Verbal description of the results
  • Figure
  • Table

Methods section

We fit a generalized linear model that modeled the germination probability for each dish as a function of extract type, genotype, and their interaction. The model assumed a binomial response distribution and logit link function. The model was fit with Hamiltonian Monte Carlo. Four chains were run, each with 1000 warmup iterations and 1000 sampling iterations. We verified model convergence by ensuring the R-hat statistics for all parameters were below 1.01. We assessed model fit by examining the posterior predictive check plot. We obtained posterior estimates of the marginal mean for each combination of extract and genotype, and of the odds ratio between each extract type within each genotype. We calculated the median and 95% equal-tailed credible interval for each estimated mean and odds ratio. To assess evidence for differences between extracts, we calculated the maximum a posteriori p-value (pMAP) and probability of direction (pd).

Results section

The modeled probability of germination was greater when seeds were treated with cucumber extract versus bean extract for both the O73 genotype (bean extract: posterior median 0.396, 95% CI [0.316, 0.478]; cucumber extract: 0.531 [0.458, 0.614]) and the O75 genotype (bean extract: 0.364 [0.305, 0.421]; cucumber extract: 0.681 [0.629, 0.734]). The odds ratio between cucumber and bean extract was smaller for the O73 genotype than the O75 genotype (ratio 0.46 [0.25, 0.83], pMAP < 0.001), indicating a greater relative effect of cucumber extract on germination probability in the O75 genotype.

Manuscript-quality figure

  • Show the median and several different credible intervals of the distributions of posterior means for each treatment combination
  • Use gather_emmeans_draws() from the tidybayes package to pull out the individual posterior draws from the emmeans object
  • Use stat_interval() to simultaneously compute our desired credible intervals and plot them
  • We need to transform the means from the logit scale to the probability scale with plogis(), here using mutate(p = plogis(.value))
    • gather_emmeans_draws() does not preserve the transformation information
gather_emmeans_draws(emmresponse_crowder_glm$emmeans) %>%
  mutate(p = plogis(.value)) %>%
  ggplot(aes(x = gen, y = p, group = interaction(gen, extract))) +
    stat_interval(aes(color = extract, color_ramp = after_stat(level)), .width = c(0.5, 0.8, 0.95), position = position_dodge(width = .3)) +
    stat_summary(fun = 'median', geom = 'point', size = 2, position = position_dodge(width = .3)) +
    colorblind_friendly +
    labs(x = 'Genotype', y = 'Modeled germination probability', color_ramp = 'CI width', color = 'Extract')

Manuscript-quality table

  • Present the medians (95% CI) of the means and the odds ratios for the two extracts within each genotype
  • Some rearranging needed to get a compact wide format
  • median_qi() from tidybayes summarizes posterior draws with a median point estimate and a quantile credible interval of a given width
  • Again we need to transform the means with plogis() to go from logit scale to probability scale
  • Transform the contrasts with exp() to go from log odds ratio scale to odds ratio scale
means_summary <- gather_emmeans_draws(emmresponse_crowder_glm$emmeans) %>%
  mutate(.value = plogis(.value)) %>%
  median_qi(.width = 0.95) %>%
  pivot_wider(id_cols = gen, names_from = extract, values_from = c(.value, .lower, .upper))

contrasts_summary <- gather_emmeans_draws(emmresponse_crowder_glm$contrasts) %>%
  mutate(.value = exp(.value)) %>%
  median_qi(.width = 0.95)

left_join(means_summary, contrasts_summary) %>%
  select(-c(contrast, .width, .point, .interval)) %>%
  gt(caption = 'Germination probability') %>%
  fmt_number(decimals = 3) %>%
  cols_merge(c(.value_bean, .lower_bean, .upper_bean), pattern = '{1} [{2}, {3}]') %>%
  cols_merge(c(.value_cucumber, .lower_cucumber, .upper_cucumber), pattern = '{1} [{2}, {3}]') %>%
  cols_merge(c(.value, .lower, .upper), pattern = '{1} [{2}, {3}]') %>%
  cols_label(gen = 'Genotype', .value_bean = 'Bean extract', .value_cucumber = 'Cucumber extract', .value = 'Bean:cucumber odds ratio')

Generalized linear mixed model example: turnip yield

Generalized linear mixed models

  • GLMM: non-normal data and a mixture of fixed and random effects
  • This is where Bayesian methods really outperform frequentist ones: complex GLMMs often fail to converge when fit by maximum likelihood, but can still be fit by Bayesian sampling
  • Model syntax combines random-effect terms on the right-hand side of the formula with a specified response family and link function

Meet the example data

  • mcconway.turnip from agridat
  • Continuous outcome (yield)
  • Asymmetrical distribution; yield must be > 0
  • Positive, right-skewed distributions are super common in ag science
data(mcconway.turnip)
ggplot(mcconway.turnip, aes(x = yield)) + geom_histogram()

Image (c) thebittenword.com

Turnip experimental design

  • 2×2×4 factorial design experiment, with three factors
    • genotype (gen, either "Barkant" or "Marco")
    • planting date (date, either "21Aug1990" or "28Aug1990")
    • planting density (density, 1, 2, 4, or 8 kg/ha)
  • Randomized complete block design with four blocks
  • Each block has 16 plots, one for each treatment combination
  • A single measurement of yield was taken from each plot
  • Treat density as a discrete variable with four levels
  • This is easier because we don’t need to worry about whether the effect of planting density on yield is linear
mcconway.turnip <- mutate(mcconway.turnip, density = factor(density))
  • Boxplot of the yield values in each treatment combination
  • Split the genotypes into separate panels using facet_wrap()
  • Arrange the boxplots in increasing order of density on the x-axis
  • Show the planting dates for each density with a different color box
ggplot(mcconway.turnip, aes(x = density, y = yield, group = interaction(density, date), color = date)) +
  facet_wrap(~ gen) +
  geom_boxplot() +
  colorblind_friendly +
  labs(x = 'Planting density (kg/ha)', y = 'Yield', color = 'Planting date')

Yield distribution

  • Interaction between planting date and density
  • Possible interaction with genotype
  • Relationship between mean and variance
  • Linear model assumes one single standard deviation for all the residuals, so unequal variance can cause bad predictions
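
A quick check of the mean-variance relationship in the raw data (a sketch using dplyr, loaded with the tidyverse):

# Does the spread of yield grow along with its mean across treatment combinations?
mcconway.turnip %>%
  group_by(gen, date, density) %>%
  summarize(mean_yield = mean(yield), sd_yield = sd(yield), .groups = 'drop') %>%
  arrange(mean_yield)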

Fit a linear mixed model

  • Pretend we don’t care about the unequal variance
  • Include all main effects, two-way interactions, and three-way interactions: gen * date * density
  • + (1 | block): random intercept for each block
  • Yield varies between 0 and 25 so we will use normal(0, 5) as a prior on fixed effects
  • All default MC sampling options
mcconway_lmm <- brm(yield ~ gen * date * density + (1 | block), 
                    prior = c(
                      prior(normal(0, 5), class = b)
                    ),
                    data = mcconway.turnip, 
                    seed = 113, file = 'fits/mcconway_lmm')

Side note: divergent transitions

  • You may see a warning about divergent transitions
  • In this case they are few enough that we can ignore them
  • They often occur when fitting complex models to small datasets
  • May indicate numerical problems with the sampler; it might be “missing” parts of the posterior distribution
  • Fix this by setting narrower priors or tweaking the sampler arguments
  • Make the sampler take smaller steps with each iteration, which reduces chance of divergent transition but is slower
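
For example, step size is controlled through brm()'s control argument; a sketch of a refit with a higher adapt_delta (closer to 1 means smaller, more cautious steps):

# Refit with a higher adapt_delta: fewer divergent transitions, but slower sampling
mcconway_lmm_careful <- brm(yield ~ gen * date * density + (1 | block),
                            prior = prior(normal(0, 5), class = b),
                            data = mcconway.turnip,
                            control = list(adapt_delta = 0.99),
                            seed = 113)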

Posterior predictive check plot on linear model

pp_check(mcconway_lmm)
  • What is our model getting wrong?
  • Also look at estimated marginal means by density
emmeans(mcconway_lmm, ~ density)

Fit a linear mixed model to the log-transformed data

  • Old-school way: log-transform the response variable then fit the model
    • Makes positive right-skewed data more symmetrical
    • Ensures that all model predictions are positive: \(e^x > 0\)
    • Stabilizes variance: flattens out relationship between mean and variance
  • Same syntax, just put log() around the response variable
  • normal(0, 2) prior because yield only varies by 2 to 3 units on the log scale
mcconway_loglmm <- brm(log(yield) ~ gen * date * density + (1 | block), 
                       prior = c(
                         prior(normal(0, 2), class = b)
                       ),
                       data = mcconway.turnip, 
                       seed = 114, file = 'fits/mcconway_loglmm')

Posterior predictive check plot: LMM with log-transformation

pp_check(mcconway_loglmm)

Gamma distribution

  • The log transformation fixed the problems with skew and unequal variance … right?
  • Yes, but transformation and back-transformation introduce bias
  • Better to keep data on the original scale and pick a response distribution that actually fits the data
  • Gamma is often a good go-to option for positive right-skewed data
  • Its spread grows with its mean (the standard deviation is proportional to the mean), so it can deal with the unequal variance issue, as the quick simulation below shows
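
A quick simulation sketch of that property: with the shape parameter held constant, the spread of a gamma distribution scales with its mean.

# Two gamma distributions with the same shape but different means
x_small <- rgamma(1e5, shape = 2, rate = 2)     # mean 1
x_large <- rgamma(1e5, shape = 2, rate = 0.2)   # mean 10
c(sd(x_small), sd(x_large))                     # roughly 0.71 and 7.1: SD grows with the mean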

Fit a GLMM with gamma response distribution

Fit a GLMM with gamma response distribution and log link function using family = Gamma(link = 'log').

mcconway_glmm <- brm(yield ~ gen * date * density + (1 | block), 
                     family = Gamma(link = 'log'), 
                     prior = c(
                       prior(normal(0, 2), class = b)
                     ),
                     data = mcconway.turnip, 
                     seed = 115, file = 'fits/mcconway_glmm')

Posterior predictive check: Gamma GLMM

pp_check(mcconway_glmm)

Estimating marginal means: Gamma GLMM

  • Marginal means for each combination of fixed effects
  • The means and CIs do not include random effect of block
  • Expected value for a block that’s “smack-dab in the middle”
  • type = 'response' gives predictions on data scale, not log link scale
emm_mcconway <- emmeans(mcconway_glmm, ~ density + date + gen, type = 'response')

emm_mcconway

Contrasts: Gamma GLMM

  • For example, pairwise contrasts between density levels within each combination of planting date and genotype
  • Contrasts are ratios
  • 'revpairwise' so that the higher density is in the numerator
contrast(emm_mcconway, 'revpairwise', by = c('date', 'gen'))

Assessing evidence for interactions between treatments

  • Effect sizes look bigger for Marco than Barkant
  • Interaction contrasts (contrasts of contrasts) to test this
contrast(emm_mcconway, 'revpairwise', by = 'date', interaction = 'revpairwise')

Presentation ideas: How to report this analysis in a manuscript?

  • Description of the statistical model for the methods section
  • Text write-up for the results section
  • A figure

Methods section

We fit a generalized linear mixed model that modeled turnip yield as a function of genotype, planting date, planting density (discrete variable with four levels), and their interactions. The model had a gamma response distribution with log link function. A random intercept was fit to each block. The model was fit with Hamiltonian Monte Carlo. Four chains were run, each with 1000 warmup iterations and 1000 sampling iterations. We verified model convergence by ensuring the R-hat statistics for all parameters were below 1.01. We assessed model fit by examining the posterior predictive check plot. We obtained posterior estimates of the marginal mean for each combination of genotype, planting date, and planting density. We also estimated the yield ratio between all pairs of density treatments within each combination of planting date and genotype. We calculated the median and 95% equal-tailed credible interval for each estimated mean and ratio.

Results section

We found evidence that turnip yield was highest at the highest levels of planting density. The effect of density was highest for the later planting date. There was not strong evidence for a different effect of planting density between the genotypes; it was positive for both. Mean yield for density level 8 kg/ha at the later planting date was 14.66 (95% CI [4.28, 31.27]) for the Barkant genotype, versus 3.82 [1.14, 8.05] at 1 kg/ha; the ratio of yields between highest and lowest density levels was 3.84 [1.32, 7.60]. For the Marco genotype, the mean yield at later planting date was 9.80 [3.32, 21.01] for 8 kg/ha density and 1.31 [0.40, 2.82] for 1 kg/ha density; the yield ratio between the two density levels was 7.45 [2.63, 14.81].

Manuscript-quality figure

  • exp() is used to back-transform the posterior estimates
gather_emmeans_draws(emm_mcconway) %>%
  mutate(.value = exp(.value)) %>%
  ggplot(aes(x = density, y = .value, group = interaction(density, date))) +
    stat_interval(aes(color = date, color_ramp = after_stat(level)), position = position_dodge(width = .5)) +
    stat_summary(fun = 'median', geom = 'point', size = 2, position = position_dodge(width = .5)) +
    facet_wrap(~ gen) +
    colorblind_friendly +
    labs(x = 'Planting density (kg/ha)', y = 'Modeled yield', color_ramp = 'CI width', color = 'Planting date')

Conclusion

We’ve made it to the end of Lesson 2! Let’s revisit the learning objectives!

We’ve learned . . .

  • How to explore different priors and how they may influence (or not influence) your conclusions
  • How to fit a Bayesian generalized linear model (GLM) for non-normal response data
  • How to fit a Bayesian generalized linear mixed model (GLMM) for non-normal response data with random effects
  • Techniques from packages brms, emmeans, tidybayes, and bayestestR

Give yourself a well-deserved round of applause!

  • See text version of lesson for further reading and useful resources
  • Please provide feedback to help me improve this course!
    • See course homepage for link to feedback form