Why learn power analysis?
- The single most-requested topic for SEA stats training
- Many scientists are required to do power analysis and view it as an annoyance
- But power analysis is critical when designing studies (experimental or observational) that give us reliable answers to questions we care about
- If we can’t do that, we can’t do science
Motivation to learn power analysis
When done properly, power analysis:
- Helps you design your study, foreseeing any issues that might arise
- Reduces the chance of doing weak and inconclusive studies that are doomed before they even start
- Elevates your science, and your science makes the world a better place
If that doesn’t motivate you to learn about power analysis, I don’t know what will!
Prerequisites
You will get the most out of this course if you …
- Know the basic principles of experimental design (randomization and replication)
- Are familiar with statistical tests and models (t-test, analysis of variance, mixed models)
- Know a little bit about data analysis using R
How to follow the course
- Slides and text version of lessons are online
- Fill in code in the worksheet (replace `...` with code)
- You can always copy and paste code from the text version of the lesson if you fall behind
At the end of this course, you will understand …
- What statistical power is and why it’s important
- Why we have to trade off between false positives and false negatives
- Why power depends on sample size
- What effect size is and why statistical power depends on it
- That doing a power analysis after you’ve collected the data is pointless
- That the more work you put into a power analysis, the more realistic and reliable it will be
At the end of this course, you will be able to …
- Extract effect sizes from published papers or previous studies
- Calculate exact statistical power for simple study designs
- Estimate statistical power using simulation for study designs that require mixed models
A framework for statistical power
- We are working within the classical statistical framework of using evidence from the data to try to reject a null hypothesis
- We can never be 100% certain of the truth without measuring every individual in the (sometimes hypothetical) population
- We might be wrong because of natural variation, measurement error, biases in our sample, or any/all of those
Right and wrong in different ways
- Assume for now we are trying to determine the truth about a binary (yes or no) outcome
- You can be right or wrong in different ways, depending on what the truth is
- False positive = Type I error
- False negative = Type II error
![Two ways to be wrong]()
- \(Y\) is the truth, \(\hat{Y}\) is the model’s prediction
Definition of statistical power
- Power of a statistical test: the probability that the test will detect an effect if the effect truly exists
- Follows directly from the idea of true and false positives and negatives
- Power is the probability of a true positive when the null hypothesis is false
- In contrast, the significance level \(\alpha\) is the probability of a false positive when the null hypothesis is true
POWER: If an effect exists, the chance that our study will find it
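As a preview of what is to come, base R's `power.t.test()` computes this probability exactly for a two-sample t-test. The numbers here (a true difference of 0.5 standard deviations, 30 subjects per group) are purely illustrative:

```r
# Power of a two-sample t-test: the probability of detecting a true
# difference of 0.5 standard deviations with 30 subjects per group,
# testing at significance level alpha = 0.05.
result <- power.t.test(n = 30, delta = 0.5, sd = 1, sig.level = 0.05)
result$power  # ~0.48: this design would miss the effect more than half the time
```

A power of about 0.48 means that even though the effect is real, this study design has less than a coin flip's chance of finding it.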
You can’t be right all the time
- It is impossible to completely eliminate all false positives and all false negatives
- The only way to be 100% certain you will never get a false positive is for your test to give 100% negative results
- But a pregnancy test that always says “You’re not pregnant” is completely useless!
- We have to find the sweet spot that reduces both false positives and false negatives to an acceptably low rate
Magic number: false positive rate
- Traditional paradigm of Western science is conservative
- Low probability of false positives, at the cost of a fairly high rate of false negatives
- Familiar \(p < 0.05\) comes from the significance level \(\alpha = 0.05\): 5% probability of a false positive
Magic number 2: false negative rate
- What about false negatives? Commonly we target 20% false negative rate, or \(\beta = 0.20\)
- \(\beta / \alpha = 0.20 / 0.05 = 4\): a 4:1 ratio of the probability of a false negative to the probability of a false positive
- No reason why we must use a 4:1 ratio
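These two conventions plug directly into a power calculation. If we leave the sample size unspecified, `power.t.test()` solves for the \(n\) needed to hit a target power; the 0.5 SD effect size below is an illustrative assumption, while \(\alpha = 0.05\) and power \(= 1 - \beta = 0.80\) are the conventional values just described:

```r
# Solve for the per-group sample size needed to detect a difference of
# 0.5 standard deviations at the conventional alpha = 0.05 (5% false
# positive rate) and power = 0.80 (20% false negative rate).
result <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)
result$n  # ~64 subjects per group
```

Note that `power.t.test()` treats whichever argument you omit as the unknown, so the same function answers "what is my power?" and "how many subjects do I need?"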