The Importance of Statistical Power Calculations

When planning research, collecting data, or interpreting an analysis, a crucial design consideration is the size of the study sample. Typically, not every individual in a population of interest can be evaluated, so a researcher must examine a sample of individuals to make inferences about the population they represent. For example, to understand the benefit of a new drug to prevent migraine, a researcher cannot possibly give the drug or a placebo to all individuals in the world who have migraine. Instead, the researcher will recruit a sample of individuals with migraine and give the drug or placebo to everyone in this sample. Still, how large does this sample need to be before we can claim to know something about how the drug works in the population?

The answer to this question can be found in statistical power calculations. Statistical power calculations are an integral part of determining the sample size for research that is evaluated using null hypothesis testing. As outlined below, when designing a study, an effect size is postulated to exist (eg, a 20% group difference in treatment response), and the study is designed to examine this effect size in relation to a null hypothesis that usually depicts the absence of an effect (eg, a 0% group difference in treatment response). Using this effect size and several other analytical assumptions, investigators can calculate a sample size that will give them a good chance of finding a statistically significant difference in their sample, despite the sampling error that is always encountered when a sample is drawn from a population. For example, the investigators may think that their new drug will produce a 20% group difference in the population, but due to sampling error, the difference observed in any individual sample drawn from that population will likely deviate somewhat from this expected value.
A statistical power calculation takes this sampling error into consideration and tells the researcher how large a sample is needed to have a suitable chance of detecting the hypothesized effect size.
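
To make the reasoning above concrete, the following is a minimal sketch of such a calculation for the migraine example. It is an illustration under stated assumptions, not a prescribed method. The placebo and drug response rates (30% and 50%, which yield the 20% group difference), the two-sided significance level of 0.05, and the target power of 0.80 are assumed values chosen for illustration and do not come from the text. The function names are hypothetical, and the formula is the common normal-approximation formula for comparing two independent proportions. A small simulation then draws repeated samples of the calculated size to show how sampling error affects the chance of reaching statistical significance.

```python
import math
import random
from statistics import NormalDist


def sample_size_two_proportions(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided comparison of two
    independent proportions (normal approximation, equal group sizes)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control             # hypothesized group difference
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


def simulated_power(n, p_control, p_treatment, alpha=0.05, trials=2000, seed=1):
    """Estimate power by drawing many samples of size n per group and counting
    how often an unpooled two-proportion z test is statistically significant."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant = 0
    for _ in range(trials):
        # Each simulated participant responds with the assumed population rate.
        resp_c = sum(rng.random() < p_control for _ in range(n)) / n
        resp_t = sum(rng.random() < p_treatment for _ in range(n)) / n
        se = math.sqrt(resp_c * (1 - resp_c) / n + resp_t * (1 - resp_t) / n)
        if se > 0 and abs(resp_t - resp_c) / se > z_crit:
            significant += 1
    return significant / trials


# Assumed illustration values: 30% placebo response, 50% drug response.
n_per_group = sample_size_two_proportions(0.30, 0.50)
print("Participants needed per group:", n_per_group)
print("Simulated power at that size:", simulated_power(n_per_group, 0.30, 0.50))
```

Under these assumed inputs, the formula gives 91 participants per group. The simulated proportion of significant results should fall near the 80% target, although the exact value depends on the random seed and the number of simulated trials.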