UvA-DARE (Digital Academic Repository): Researchers' Intuitions About Power in Psychological Research

Many psychology studies are statistically underpowered. In part, this may be because many researchers rely on intuition, rules of thumb, and prior practice (along with practical considerations) to determine the number of subjects to test. In Study 1, we surveyed 291 published research psychologists and found large discrepancies between their reports of their preferred amount of power and the actual power of their studies (calculated from their reported typical cell size, typical effect size, and acceptable alpha). Furthermore, in Study 2, 89% of the 214 respondents overestimated the power of specific research designs with a small expected effect size, and 95% underestimated the sample size needed to obtain .80 power for detecting a small effect. Neither researchers’ experience nor their knowledge predicted the bias in their self-reported power intuitions. Because many respondents reported that they based their sample sizes on rules of thumb or common practice in the field, we recommend that researchers conduct and report formal power analyses for their studies.
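The gap the abstract describes can be made concrete with a short calculation. Below is a minimal sketch of a power computation for a two-sided, two-sample t test, using the normal approximation to the noncentral t distribution; the function names and the example cell size of 25 are illustrative assumptions, not values taken from the study.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample t test with
    n_per_group subjects per cell and standardized effect size d
    (normal approximation; adequate for moderate-to-large n)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)          # two-sided critical value
    ncp = d * sqrt(n_per_group / 2)            # noncentrality under H1
    return (1 - z.cdf(z_crit - ncp)) + z.cdf(-z_crit - ncp)

def n_for_power(d, target_power=0.80, alpha=0.05):
    """Smallest per-cell n reaching the target power (brute-force search)."""
    n = 2
    while power_two_sample(d, n, alpha) < target_power:
        n += 1
    return n

# A small effect (d = .20) with a hypothetical cell size of 25
# yields power far below .80, while .80 power needs roughly
# 400 subjects per cell.
print(power_two_sample(0.2, 25))
print(n_for_power(0.2))
```

Run with exact noncentral-t routines (e.g., the R `pwr` package cited in the literature above, or G*Power), the required n per cell for d = .20 at alpha = .05 comes out near 394; the normal approximation here lands within a subject or two of that, which is sufficient to show how severely intuition can underestimate it.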
