Power Analysis Tutorial

Why learn power analysis?

  • The single most-requested topic for SEA stats training
  • Many scientists are required to do power analysis and view it as an annoyance
  • But power analysis is critical when designing studies (experimental or observational) that give us reliable answers to questions we care about
  • If we can’t do that, we can’t do science

Motivation to learn power analysis

When done properly, power analysis:

  • Helps you design your study and anticipate issues before they arise
  • Reduces the chance of doing weak and inconclusive studies that are doomed before they even start
  • Elevates your science, and your science makes the world a better place

If that doesn’t motivate you to learn about power analysis, I don’t know what will!

Prerequisites

You will get the most out of this course if you …

  • Know the basic principles of experimental design (randomization and replication)
  • Are familiar with statistical tests and models (t-test, analysis of variance, mixed models)
  • Know a little bit about data analysis using R

How to follow the course

  • Slides and text version of lessons are online
  • Fill in code in the worksheet (replace ... with code)
  • You can always copy and paste code from text version of lesson if you fall behind

At the end of this course, you will understand …

  • What statistical power is and why it’s important
  • Why we have to trade off between false positives and false negatives
  • Why power depends on sample size
  • What effect size is and why statistical power depends on it
  • That doing a power analysis after you’ve collected the data is pointless
  • That the more work you put into a power analysis, the higher its quality and realism

At the end of this course, you will be able to …

  • Extract effect sizes from published papers or previous studies
  • Calculate exact statistical power for simple study designs
  • Estimate statistical power using simulation for study designs that require mixed models

Power basics

A framework for statistical power

  • We are working within the classical statistical framework of using evidence from the data to try to reject a null hypothesis
  • We can never be 100% certain of the truth without measuring every individual in the (sometimes hypothetical) population
  • We might be wrong because of natural variation, measurement error, biases in our sample, or any/all of those

Right and wrong in different ways

  • Assume for now we are trying to determine the truth about a binary (yes or no) outcome
  • You can be right or wrong in different ways, depending on what the truth is
    • False positive = Type I error
    • False negative = Type II error

Two ways to be wrong

  • \(Y\) is the truth, \(\hat{Y}\) is the model’s prediction
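
Combining the truth with the prediction gives four possible outcomes:

  • \(Y\) positive, \(\hat{Y}\) positive: true positive
  • \(Y\) negative, \(\hat{Y}\) positive: false positive (Type I error)
  • \(Y\) positive, \(\hat{Y}\) negative: false negative (Type II error)
  • \(Y\) negative, \(\hat{Y}\) negative: true negative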

Definition of statistical power

  • Power of a statistical test: the probability that the test will detect a phenomenon if the phenomenon really exists
  • Follows directly from the idea of true and false positives and negatives
  • Power is the probability of declaring a true positive if the null hypothesis is false
  • In contrast, the significance level \(\alpha\) (the threshold we compare the p-value against) is the probability of declaring a false positive if the null hypothesis is true

POWER: If an effect exists, the chance that our study will find it
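
As a concrete illustration (with arbitrary placeholder numbers, not values from the lesson), base R's power.t.test() computes this probability exactly for a two-sample t-test:

```r
# Power of a two-sample t-test with 20 observations per group, a true
# difference in means of 0.5, a standard deviation of 1, and alpha = 0.05.
# All of these values are illustrative placeholders.
power.t.test(n = 20, delta = 0.5, sd = 1, sig.level = 0.05)
```

For these inputs the power comes out to roughly 0.34: if the effect is real, a study this size would detect it only about a third of the time.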

You can’t be right all the time

  • It is impossible to completely eliminate all false positives and all false negatives
  • The only way to be 100% certain you will never get a false positive is for your test to give 100% negative results
  • But a pregnancy test that always says “You’re not pregnant” is completely useless!
  • We have to find the sweet spot that reduces both false positives and false negatives to an acceptably low rate
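
A quick R sketch of this tradeoff (again with placeholder numbers): holding the design fixed, tightening the false positive threshold reduces power, i.e., raises the false negative rate:

```r
# Power of the same two-sample t-test (n = 20 per group, delta = 0.5, sd = 1)
# at progressively stricter significance levels. Illustrative values only.
sapply(c(0.05, 0.01, 0.001), function(alpha) {
  power.t.test(n = 20, delta = 0.5, sd = 1, sig.level = alpha)$power
})
```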

Magic number: false positive rate

  • Traditional paradigm of Western science is conservative
  • Low probability of false positives, at the cost of a fairly high rate of false negatives
  • Familiar \(p < 0.05\) comes from the significance level \(\alpha = 0.05\): 5% probability of a false positive

Magic number 2: false negative rate

  • What about false negatives? Commonly we target a 20% false negative rate, \(\beta = 0.20\), which corresponds to a power of \(1 - \beta = 0.80\)
  • \(\frac{0.20}{0.05} = 4\): a 4:1 ratio of the probability of a false negative to the probability of a false positive
  • There is no fundamental reason why we must use a 4:1 ratio; it is simply convention
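
Putting the two magic numbers together, power.t.test() can also be run in reverse: leave n unspecified and it solves for the sample size that hits both targets (the effect size below is still an illustrative placeholder):

```r
# Per-group sample size needed for 80% power (beta = 0.20) at alpha = 0.05,
# assuming a placeholder effect: difference in means 0.5, standard deviation 1.
power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)
```

For this effect size the answer is about 64 observations per group.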