Current sample size conventions: Flaws, harms, and alternatives

Background: The belief remains widespread that medical research studies must have statistical power of at least 80% in order to be scientifically sound, and peer reviewers often question whether power is high enough.

Discussion: This requirement and the methods for meeting it have severe flaws. Notably, the true nature of how sample size influences a study's projected scientific or practical value precludes any meaningful blanket designation of <80% power as "inadequate". In addition, standard calculations are inherently unreliable, and focusing only on power neglects a completed study's most important results: estimates and confidence intervals. Current conventions harm the research process in many ways: promoting misinterpretation of completed studies, eroding scientific integrity, giving reviewers arbitrary power, inhibiting innovation, perverting ethical standards, wasting effort, and wasting money. Medical research would benefit from alternative approaches, including established value of information methods, simple choices based on cost or feasibility that have recently been justified, sensitivity analyses that examine a meaningful array of possible findings, and following previous analogous studies. To promote more rational approaches, research training should cover the issues presented here, peer reviewers should be extremely careful before raising issues of "inadequate" sample size, and reports of completed studies should not discuss power.

Summary: Common conventions and expectations concerning sample size are deeply flawed, cause serious harm to the research process, and should be replaced by more rational alternatives.
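As an illustration of the claim that standard calculations are inherently unreliable, the following minimal sketch (not taken from the paper) applies the textbook normal-approximation formula for comparing two means, n per group = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta)^2. The function name `n_per_group` and the values chosen for `delta` and `sigma` are hypothetical; they are used only to show how strongly the "required" sample size depends on an assumed standard deviation that is rarely known in advance.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided, two-sample comparison of means.

    Uses the standard normal approximation; delta is the assumed true
    difference in means and sigma the assumed common standard deviation.
    Both are guesses made before the study, which is the point of the sketch.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: the same target difference, three plausible
    # guesses for the standard deviation.
    for sigma in (8, 10, 12):
        print(f"assumed sigma = {sigma}: n per group = {n_per_group(5, sigma)}")
```

Because the result scales with the square of sigma / delta, modest disagreement about either assumption changes the calculated sample size substantially, which is one reason a single computed number offers weak grounds for labelling a study "inadequate".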
