When done properly, power analysis:
If that doesn’t motivate you to learn about power analysis, I don’t know what will!
You will get the most out of this course if you …
...
with code)
POWER: If an effect exists, the chance that our study will find it
Tradition!
t-test with \(n = 50\) per group and a medium effect size
Some are known and some need to be estimated:
Effect | Effect size metric | Description | Example |
---|---|---|---|
Difference of the mean of two groups | Cohen’s d | The difference between the two means divided by their combined standard deviation. How many standard deviations greater is one mean than the other? | The effect of a management intervention on biomass yield |
Difference in a binary outcome between two groups | Odds ratio | The ratio of the odds of the outcome happening in one group, to the odds of it happening in another group | The effect of a seed treatment on seed germination rate |
Strength of relationship between two continuous variables | Pearson correlation coefficient r | You should know this one! | The effect of temperature on soil respiration |
Variance explained by treatment group when comparing three or more groups | Cohen’s f | (Square root of) the ratio of variance explained by group, to the residual variance not explained by group | The effect of multiple genotypes and environments on a trait |
R packages:
SAS procedures:
proc power
proc glmpower
proc glimmix
besag.beans
dataset from agridat packageanova()
function calculates sums of squares for one-way ANOVA of yield by genotypecohens_f()
from effectsize package
Formula for Cohen’s d effect size:
\[d = \frac{\bar{x}_1 - \bar{x}_2}{s}\]
Formula for \(s\), the pooled standard deviation:
\[s = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2} {n_1 + n_2 - 2}}\]
Examine the first few rows of the data
abs()
of effect sizes because we are interested in detecting either direction of effect with a two-sided testBt_d_quantiles <- quantile(abs(Bt_Cohen_d), probs = c(.25, .5, .75))
ggplot(data.frame(d = abs(Bt_Cohen_d)), aes(x = d)) +
geom_histogram(bins = 10) +
geom_vline(xintercept = Bt_d_quantiles[1], color = 'forestgreen', linewidth = 1) +
geom_vline(xintercept = Bt_d_quantiles[2], color = 'darkorange', linewidth = 2) +
geom_vline(xintercept = Bt_d_quantiles[3], color = 'forestgreen', linewidth = 1) +
theme_bw()
round(Bt_d_quantiles, 3)
d | f | Effect size |
---|---|---|
0.20 | 0.10 | Small |
0.50 | 0.25 | Moderate |
0.80 | 0.40 | Large |
help(package = 'WebPower')
wp.anova()
to calculate power for a one-way ANOVA designwp.anova()
In all cases assume k = 4
groups being compared
k = 4
: number of groupsf = 0.25
: moderate Cohen’s f effect sizealpha = 0.05
: typical significance levelpower = 0.8
: typical power to targettype = 'overall'
: power for the overall F-statistic in the ANOVA
type
arguments are possible, for example the power to detect a difference in two specific group meansf
this time, but specify n = 60
f = 0.4
n = 60
samplespower
ns
wp.anova()
passing the vector of many n
and f
values to the function$power
column from the output of wp.anova()
and add it to our original data frameggplot(n_f_grid, aes(x = n, y = power)) +
geom_line(aes(group = f, color = factor(f)), linewidth = 0.8) +
geom_point(aes(fill = factor(f)), size = 3, shape = 21) +
geom_hline(yintercept = 0.8, linetype = 'dashed', linewidth = 1) +
scale_color_brewer(name = "Cohen's f", palette = 'Reds', aesthetics = c('color', 'fill')) +
scale_x_continuous(breaks = ns) +
theme_bw()
anova_MDES <- expand.grid(n = ns)
anova_MDES$MDES <- map_dbl(anova_MDES$n, ~ wp.anova(k = 4, n = ., alpha = 0.05, power = 0.8, type = 'overall')$f)
ggplot(anova_MDES, aes(x = n, y = MDES)) +
geom_line(linewidth = 0.8) + geom_point(size = 3) +
geom_hline(yintercept = 0.25, linetype = 'dashed', linewidth = 1) +
scale_x_continuous(breaks = ns) +
theme_bw()
rnorm()
t.test()
with all default options to compare means of the two random samplesfor
loopp_vals
, a vector of zerosfor
loop executes the same code 10000 times, increasing the value of i
by 1 each time
rnorm()
i
th element of p_vals
p < 0.05
= power!wp.t()
linder.wheat
from agridat package: wheat yield trial in Switzerland
gen
, environment env
, G by E interaction gen:env
(1 | block:env)
joint_tests()
: ANOVA table with F-testsVarCorr()
: variances for block and for individual plots (residuals)lmer()
extend()
extend()
creates a new model object to simulate from
along = 'block'
means add new levels to block (n = 8
)powerCurve()
along = 'block'
to simulate datasets with different numbers of blocks, up to the extended value 8test
specifies we’re testing the fixed effect of genotype using the F-testn = 10
means simulate 10 datasets for each number of blockscbpp
dataset from lme4 package
Zebu, photo by Scott Bauer, USDA-ARS
disease = 0
if uninfected and disease = 1
if infected)extend()
in two different waysalong = 'herd'
within = 'period+herd'
breaks
, specific numbers of cattle to test, otherwise default is usedcbpp_pcurve_n_herds <- powerCurve(cbpp_glmm_n_herds,
along = 'herd',
test = fixed('period', method = 'chisq'),
nsim = 10,
seed = 4)
cbpp_pcurve_n_within_herd <- powerCurve(cbpp_glmm_n_within_herd,
within = 'period+herd',
test = fixed('period', method = 'chisq'),
nsim = 10,
seed = 5,
breaks = c(3, 5, 10, 15, 25, 50))
No power = no new knowledge = no science
quentin.read@usda.gov
!