Here is one way to do it. We get the posterior estimates of the ratios using contrast() and then pipe that result into describe_posterior(). Note that it is very important to specify null = 1: the default is null = 0, which will give incorrect results because a ratio of 0 is not an appropriate null value.
contrast(emm_mcconway, 'revpairwise', by = c('date', 'gen')) %>%
describe_posterior(test = c('pd', 'p_map'), null = 1)
## Summary of Posterior Distribution
##
## Parameter | Median | 95% CI | p (MAP) | pd
## -------------------------------------------------------------------------------
## density2/density1 21Aug1990 Barkant | 1.33 | [0.64, 2.83] | 0.973 | 77.30%
## density4/density1 21Aug1990 Barkant | 1.45 | [0.70, 3.02] | 0.746 | 84.23%
## density4/density2 21Aug1990 Barkant | 1.10 | [0.49, 2.39] | 0.935 | 58.53%
## density8/density1 21Aug1990 Barkant | 1.89 | [0.94, 4.00] | 0.381 | 96.38%
## density8/density2 21Aug1990 Barkant | 1.43 | [0.67, 3.15] | 0.858 | 82.05%
## density8/density4 21Aug1990 Barkant | 1.31 | [0.60, 2.91] | > .999 | 75.48%
## density2/density1 28Aug1990 Barkant | 1.37 | [0.63, 3.01] | 0.902 | 79.30%
## density4/density1 28Aug1990 Barkant | 3.30 | [1.55, 7.09] | 0.042 | 99.85%
## density4/density2 28Aug1990 Barkant | 2.43 | [1.08, 5.50] | 0.243 | 98.30%
## density8/density1 28Aug1990 Barkant | 3.86 | [1.78, 8.36] | 0.026 | 99.98%
## density8/density2 28Aug1990 Barkant | 2.84 | [1.24, 6.24] | 0.144 | 99.12%
## density8/density4 28Aug1990 Barkant | 1.17 | [0.53, 2.51] | 0.994 | 65.53%
## density2/density1 21Aug1990 Marco | 1.81 | [0.84, 3.86] | 0.505 | 93.88%
## density4/density1 21Aug1990 Marco | 2.52 | [1.15, 5.68] | 0.194 | 98.90%
## density4/density2 21Aug1990 Marco | 1.39 | [0.61, 3.12] | 0.927 | 78.65%
## density8/density1 21Aug1990 Marco | 3.74 | [1.69, 8.13] | 0.030 | 99.98%
## density8/density2 21Aug1990 Marco | 2.07 | [0.95, 4.81] | 0.389 | 96.45%
## density8/density4 21Aug1990 Marco | 1.50 | [0.67, 3.35] | 0.886 | 83.30%
## density2/density1 28Aug1990 Marco | 1.69 | [0.77, 3.77] | 0.645 | 90.10%
## density4/density1 28Aug1990 Marco | 5.56 | [2.53, 12.32] | < .001 | 100%
## density4/density2 28Aug1990 Marco | 3.29 | [1.42, 7.56] | 0.077 | 99.88%
## density8/density1 28Aug1990 Marco | 7.39 | [3.49, 16.22] | < .001 | 100%
## density8/density2 28Aug1990 Marco | 4.40 | [1.98, 9.78] | < .001 | 100%
## density8/density4 28Aug1990 Marco | 1.33 | [0.61, 2.91] | 0.910 | 77.22%
Here we fit the model with a tighter prior.
crowder_glm_tightprior <- brm(germ | trials(n) ~ gen * extract,
                              family = binomial(link = 'logit'),
                              prior = c(
                                prior(normal(0, 0.5), class = b)
                              ),
                              data = crowder.seeds,
                              seed = 333, file = 'fits/crowder_glm_tightprior')
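If you want to confirm which priors were actually applied to this fit, one quick check (just a sketch) is to print them with brms's prior_summary():
# Show the priors brms used for this model; the normal(0, 0.5) prior
# should appear for the class 'b' (fixed-effect) coefficients.
prior_summary(crowder_glm_tightprior)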
There are various ways to compare the inference from this model to the inference from the original model, which had a wider prior on the fixed effects. One way is to estimate the marginal means and contrasts for each model and compare the output of describe_posterior() between the two.
emmresponse_crowder_glm_tightprior <- emmeans(crowder_glm_tightprior, pairwise ~ extract | gen, type = 'response')
Here are the posterior estimates of the contrasts from the original model we fit with a wider prior:
describe_posterior(emmresponse_crowder_glm$contrasts, ci_method = 'eti', test = c('pd', 'p_map'), null = 1)
## Summary of Posterior Distribution
##
## Parameter | Median | 95% CI | p (MAP) | pd
## ------------------------------------------------------------
## bean/cucumber O73 | 0.58 | [0.36, 0.95] | 0.062 | 98.42%
## bean/cucumber O75 | 0.26 | [0.19, 0.38] | < .001 | 100%
And here are those estimates from the model with the tighter prior:
describe_posterior(emmresponse_crowder_glm_tightprior$contrasts, ci_method = 'eti', test = c('pd', 'p_map'), null = 1)
## Summary of Posterior Distribution
##
## Parameter | Median | 95% CI | p (MAP) | pd
## ------------------------------------------------------------
## bean/cucumber O73 | 0.57 | [0.38, 0.83] | 0.011 | 99.78%
## bean/cucumber O75 | 0.29 | [0.21, 0.40] | < .001 | 100%
You can see that the medians and credible intervals are very similar to the ones above. It looks like our inference is not affected much by reducing the standard deviation of the fixed-effect prior by a factor of 10. I included this exercise to demonstrate that even for small to mid-sized datasets, the posterior is often very insensitive to your choice of prior. I consider that a good thing. In the example with the Bt corn dataset, the choice of prior was exceptionally influential because it was such a tiny dataset.
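If you want a more direct look at the prior sensitivity, one option is to overlay the fixed-effect posteriors from the two fits. The sketch below assumes the wider-prior model object is named crowder_glm with a normal(0, 5) prior on the fixed effects, and that dplyr and ggplot2 are loaded as elsewhere in this document.
library(tidyr)  # for pivot_longer()
# Pull the posterior draws of the fixed effects from each model and label them by prior.
draws_wide  <- as_draws_df(crowder_glm) %>% as_tibble() %>% mutate(prior = 'normal(0, 5)')
draws_tight <- as_draws_df(crowder_glm_tightprior) %>% as_tibble() %>% mutate(prior = 'normal(0, 0.5)')
# Overlay the posterior densities; near-identical curves indicate the prior had little influence.
bind_rows(draws_wide, draws_tight) %>%
  pivot_longer(starts_with('b_'), names_to = 'coefficient') %>%
  ggplot(aes(x = value, color = prior)) +
  geom_density() +
  facet_wrap(~ coefficient, scales = 'free')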
The boxplot indicates that years with blight tended to have more rainy days in April and May.
data(johnson.blight)
ggplot(johnson.blight, aes(x = factor(blight), y = rain.am)) + geom_boxplot()
Here is one way to fit the model. FYI, there is a family called bernoulli for the special case of a binary outcome with a single trial per row of the dataset. An equivalent model to this one would be brm(blight ~ rain.am, family = bernoulli(link = 'logit'), ...), sketched in full just below.
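Spelled out, that bernoulli version might look like the sketch below (the object and file names here are made up; the prior, data, and seed mirror the binomial fit that follows):
# Equivalent fit using the bernoulli family: one 0/1 outcome per row,
# so no trials() term is needed.
blight_glm_bern <- brm(blight ~ rain.am,
                       family = bernoulli(link = 'logit'),
                       prior = c(
                         prior(normal(0, 5), class = b)
                       ),
                       data = johnson.blight,
                       seed = 123,
                       file = 'fits/blight_glm_bern')  # hypothetical cache file
The answer below uses the binomial family with trials(1), which specifies the same model.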
blight_glm <- brm(blight | trials(1) ~ rain.am,
                  family = binomial(link = 'logit'),
                  prior = c(
                    prior(normal(0, 5), class = b)
                  ),
                  data = johnson.blight,
                  seed = 123,
                  file = 'fits/blight_glm')
## Only 2 levels detected so that family 'bernoulli' might be a more efficient choice.
## Only 2 levels detected so that family 'bernoulli' might be a more efficient choice.
summary(blight_glm)
## Family: binomial
## Links: mu = logit
## Formula: blight | trials(1) ~ rain.am
## Data: johnson.blight (Number of observations: 25)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -5.82 2.31 -10.89 -2.06 1.00 2581 2138
## rain.am 0.53 0.21 0.17 0.99 1.00 2456 1960
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
The model summary indicates that there is evidence that as the number of rainy days in April and May increases, the probability of potato blight increases. The point estimate of the coefficient is 0.53 and the 95% CI ranges from 0.17 to 0.99. Because the model is on the log-odds scale, the point estimate can be interpreted as: for every additional rainy day in April or May, the log-odds of potato blight increase by 0.53. Differences on the log-odds scale are log odds ratios, so this means the odds of blight increase by a factor of exp(0.53), or about 1.70, for each extra day of rain.
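If you want that odds-ratio summary straight from the posterior, one quick way (a sketch; b_rain.am is the usual brms name for the fixed-effect draws of rain.am) is to exponentiate the draws of the slope and summarize them:
# Odds ratio per additional rainy day: exponentiate the slope draws,
# then report the median and a 95% quantile interval.
draws_blight <- as_draws_df(blight_glm)
quantile(exp(draws_blight$b_rain.am), probs = c(0.5, 0.025, 0.975))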