... with code)dbeta() uses those parameters but calls them shape1 and shape2mu is the mean parameterphi is the precision parametersnijders.fusarium from agridat packagelibrary(brms)
library(agridat)
library(tidyverse)
library(emmeans)
library(tidybayes)
library(ggdist)
library(bayestestR)
library(gt)
library(fitdistrplus)
theme_set(theme_bw())
colorblind_friendly <- scale_color_manual(values = unname(palette.colors(5)[-1]), aesthetics = c('colour', 'fill'))
options(mc.cores = 4, brms.file_refit = 'on_change') # If you don't have CmdStan
options(mc.cores = 4, brms.file_refit = 'on_change', brms.backend = 'cmdstanr') # If you do have CmdStangen, strain, and year to factor data typeggplot(snijders.fusarium, aes(x = year, y = p, color = strain)) +
geom_point(position = position_jitterdodge(jitter.width = 0.1, jitter.height = 0, dodge.width = 0.5), alpha = 0.5) +
stat_summary(fun = 'median', geom = 'point', size = 15, shape = 95, position = position_dodge(width = 0.5)) +
colorblind_friendlyp ~ strain * year + (1 | gen) + (1 | gen:year)
p on the leftfamily = Beta(link = 'logit', link_phi = 'log')
phimu, but only fits one global precision parameter phiphi to vary with fixed and/or random effects give us a better fit?brm() we specify a separate formula for each model parameter, wrapped in bf()phi ~ strain * year means the precision parameter has fixed effects on it toophi, using the argument dpar = phi
dpar stands for “distributional parameter”fusarium_beta_fit_varyphi <- brm(
bf(
p ~ strain * year + (1 | gen) + (1 | gen:year),
phi ~ strain * year
),
family = Beta(link = 'logit', link_phi = 'log'),
data = snijders.fusarium,
prior = c(
prior(normal(0, 5), class = b),
prior(normal(0, 2), class = b, dpar = phi)
),
chains = 4, iter = 2000, warmup = 1000, seed = 777,
file = 'fits/fusarium_beta_fit_varyphi'
)bf() syntaxbf() syntax in a typical general linear model with normally distributed error
sigma can vary with fixed or random effects, if you want to model unequal variance among groupspp_check() and model comparisonWhat can we conclude?
phitype = 'response' means use inverse link scale (the same scale as the original data)regrid(fusarium_emms, transform = 'log') to get the contrasts in the form of ratios, not odds ratiosdescribe_posterior() from the bayestestR package to get credible intervals for the means and pairwise differences
We fit a GLMM with beta response distribution to the proportion Fusarium infection data. A logit link function was used for the mean parameter, and a log link for the precision parameter \(\phi\). Strain and year were treated as fixed effects, and a random intercept and random slope with respect to year were fit for each genotype. We assigned Normal(0, 5) prior distributions to the fixed effects. Hamiltonian Monte Carlo was used to sample from the posterior distribution, with four Markov chains each run for 1000 warmup iterations and 1000 sampling iterations. We verified model convergence with the R-hat statistic and graphically assessed model fit with posterior predictive check plots. We compared this model with a model that had additional fixed effects on the precision parameter \(\phi\). LOO cross-validation showed this model was no better than the model without fixed effects on \(\phi\) (\(\Delta_{ELPD}\) = 10.2 with standard error = 10.2), therefore we present the simpler model.
We obtained posterior estimates of the marginal means for each combination of strain and year, back-transformed from the logit scale, and took pairwise ratios between strain means within each year. We characterized the posterior distributions of means and ratios using the median and equal-tailed credible intervals. We teested the null hypothesis that each ratio was equal to 1 using two different Bayesian p-value analogues: probability of direction \(p_d\) and maximum a posteriori p-value \(p_{MAP}\).
Incidence of Fusarium infection was overall low except for strain F39 in 1986 and 1987, with estimated mean infection proportions 0.24 with 95% CI [0.16, 0.34] and 0.28 [0.19, 0.40] respectively. The evidence is consistent with a greater proportion infection for F39 relative to other strains, with approximately a tenfold difference in 1987 and smaller differences in the twofold to fivefold range in 1986 and 1988 (see Table).
gather_emmeans_draws(fusarium_emms) %>%
mutate(.value = plogis(.value)) %>%
ggplot(aes(x = year, y = .value, group = interaction(strain, year))) +
stat_interval(aes(color = strain, color_ramp = after_stat(level)), .width = c(.66, .95), position = ggpp::position_dodgenudge(width = .5, x = .03)) +
stat_summary(fun = 'median', geom = 'point', size = 2, position = ggpp::position_dodgenudge(width = .5, x = .03)) +
geom_point(aes(y = p, color = strain), data = snijders.fusarium, position = ggpp::position_dodgenudge(width = .5, x = -.03), alpha = .5) +
colorblind_friendly +
labs(y = 'proportion infected', color_ramp = 'CI level')fusarium_contrasts_CI %>%
filter(CI == 0.95) %>%
dplyr::select(-c(p, CI)) %>%
gt(groupname_col = 'year') %>%
fmt_number(n_sigfig = 3) %>%
cols_label(Median = 'ratio', CI_low = '95% CI lower bound', CI_high = '95% CI upper bound', p_MAP = md('p~MAP~')) %>%
sub_zero(columns = p_MAP, zero_text = '< 0.001') %>%
tab_options(row_group.background.color = 'slateblue3',
column_labels.font.weight = 'bold')
Images (c) Philippa Willitts and Kirby Lee/USA Today Sports
LOD/2log(x + 1)? log(x + 0.0000001)?
Images from Li livres dou santé (13th c.) and Miracles de Notre Dame (15th c.)