Statistical Rigor and the Perils of Chance

Concerns about the reliability and reproducibility of biomedical research have been voiced across several arenas. In this commentary, I discuss how a poor appreciation of the role of chance in statistical inference contributes to this problem. In particular, I describe how poor scientific design, such as low statistical power, and questionable research practices, such as post hoc hypothesizing and undisclosed flexibility in analyses, yield a high proportion of false-positive results. I discuss how the current publication and funding system perpetuates this poor practice by rewarding positive, yet often unreliable, results over rigorous methods. I conclude by discussing how scientists can avoid being fooled by chance findings by adopting well-established, but often ignored, methodological best practices.

There is increasing awareness of the problem of unreliable findings across the biomedical sciences (Ioannidis, 2005). Many "landmark" findings could not be replicated (Scott et al., 2008; Begley and Ellis, 2012; Steward et al., 2012), and many promising preclinical findings have failed to translate into clinical application (Perel et al., 2007; Prinz et al., 2011), leading many to question whether science is broken (Economist, 2013). Central to this problem is a poor appreciation of the role of chance in the scientific process. As neuroscience has developed over the past 50 years, many of the large, easily observable effects have been found, and the field is likely pursuing smaller and more subtle effects. The corresponding growth in computational capabilities (Moore, 1998) means that researchers can run numerous tests on a single dataset in a matter of minutes. The human brain processes randomness poorly, and the huge potential for undisclosed analytical flexibility in modern data-management packages leaves researchers increasingly vulnerable to being fooled by chance. Researchers cannot measure an entire population of interest, so they take samples and use statistical inference to determine the probability that the results …
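
To make this arithmetic concrete, here is a minimal simulation sketch in Python (all parameters are hypothetical illustrations, not figures from the commentary). It assumes a field in which only 10% of tested hypotheses are true, true effects are small (Cohen's d = 0.3), and samples are small (n = 20 per group), and it shows the dynamic argued by Ioannidis (2005) and Button et al. (2013): under low power and a low prior probability of true effects, most statistically significant results are false positives.

```python
# Minimal sketch (hypothetical parameters): how low power plus a low prior
# probability of true effects inflates the share of false-positive findings.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_studies = 10_000   # number of simulated independent studies
prior_true = 0.10    # assume only 10% of tested hypotheses are actually true
effect_size = 0.3    # true effect in Cohen's d units (a small effect)
n_per_group = 20     # small samples -> low statistical power (~15% here)
alpha = 0.05         # conventional significance threshold

true_positives = false_positives = 0
for _ in range(n_studies):
    has_effect = rng.random() < prior_true
    shift = effect_size if has_effect else 0.0
    control = rng.normal(0.0, 1.0, n_per_group)
    treated = rng.normal(shift, 1.0, n_per_group)
    _, p_value = stats.ttest_ind(treated, control)  # two-sample t-test
    if p_value < alpha:
        if has_effect:
            true_positives += 1
        else:
            false_positives += 1

significant = true_positives + false_positives
print(f"significant results: {significant} of {n_studies}")
print(f"proportion false positive: {false_positives / significant:.2f}")
```

With these settings, roughly 450 of the ~9,000 null studies reach p < 0.05 but only about 150 of the ~1,000 true-effect studies do, so around three-quarters of "significant" findings are false; raising power or the prior probability of true effects shrinks that proportion.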

References

[1] D. Fanelli. "Positive" Results Increase Down the Hierarchy of the Sciences. PLoS ONE, 2010.

[2] K. Button, et al. Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 2013.

[3] C. Begley and L. Ellis. Drug development: Raise standards for preclinical cancer research. Nature, 2012.

[4] Open Science Collaboration. Estimating the reproducibility of psychological science. Science, 2015.

[5] D. Rennie. CONSORT revised: improving the reporting of randomized trials. JAMA, 2001.

[6] C. Kilkenny, et al. Improving Bioscience Research Reporting: The ARRIVE Guidelines for Reporting Animal Research. PLoS Biology, 2010.

[7] J. P. Simmons, L. D. Nelson, and U. Simonsohn. False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 2011.

[8] N. N. Taleb. Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets. 2001.

[9] L. K. John, G. Loewenstein, and D. Prelec. Measuring the Prevalence of Questionable Research Practices With Incentives for Truth Telling. Psychological Science, 2012.

[10] O. Steward, et al. Replication and reproducibility in spinal cord injury research. Experimental Neurology, 2012.

[11] P. Perel, et al. Comparison of treatment effects between animal experiments and clinical trials: systematic review. BMJ, 2007.

[12] J. A. C. Sterne, et al. Sifting the evidence: what's wrong with significance tests? BMJ, 2001.

[13] A. Clark, et al. Preventing the ends from justifying the means: withholding results to address publication bias in peer-review. BMC Psychology, 2016.

[14] C. Chambers. Registered Reports: A new publishing initiative at Cortex. Cortex, 2013.

[15] D. Moher, et al. Transparent and accurate reporting increases reliability, utility, and impact of your research: reporting guidelines and the EQUATOR Network. BMC Medicine, 2010.

[16] J. Ioannidis. Why Most Published Research Findings Are False. PLoS Medicine, 2005.

[17] A. Kumar, et al. Medical research: Trial unpredictability yields predictable therapy gains. Nature, 2013.

[18] R. Nuzzo. How scientists fool themselves – and how they can stop. Nature, 2015.

[19] K. Dickersin, et al. The evolution of trial registries and their use to assess the clinical trial enterprise. JAMA, 2012.

[20] M. Munafò, et al. The genetic architecture of psychophysiological phenotypes. Psychophysiology, 2014.

[21] D. Fanelli. Do Pressures to Publish Increase Scientists' Bias? An Empirical Support from US States Data. PLoS ONE, 2010.

[22] E. Wagenmakers, et al. Scientific rigor and the art of motorcycle maintenance. Nature Biotechnology, 2014.

[23] A. Gelman and J. Carlin. Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors. Perspectives on Psychological Science, 2014.

[24] M. Munafò, et al. Bias in genetic association studies and impact factor. Molecular Psychiatry, 2009.

[25] F. Prinz, et al. Believe it or not: how much can we rely on published data on potential drug targets? Nature Reviews Drug Discovery, 2011.

[26] M. Schillebeeckx, et al. The missing piece to changing the university culture. Nature Biotechnology, 2013.

[27] S. Landis, et al. A call for transparent reporting to optimize the predictive value of preclinical research. Nature, 2012.

[28] G. E. Moore. Cramming More Components Onto Integrated Circuits. Proceedings of the IEEE, 1998.

[29] U. Dirnagl, et al. Evidence for the Efficacy of NXY-059 in Experimental Focal Cerebral Ischaemia Is Confounded by Study Quality. Stroke, 2008.

[30] S. Scott, et al. Design, power, and interpretation of studies in the standard murine model of ALS. Amyotrophic Lateral Sclerosis, 2008.

[31] D. van Dijk, et al. Publication metrics and success on the academic job market. Current Biology, 2014.

[32] A. Cimpian, et al. The pipeline project: Pre-publication independent replications of a single laboratory's research pipeline. 2016.

[33] R. Rosenthal. The file drawer problem and tolerance for null results. Psychological Bulletin, 1979.