Common Misconceptions about Data Analysis and Statistics

Ideally, any experienced investigator with the right tools should be able to reproduce a finding published in a peer-reviewed biomedical science journal. In practice, however, the reproducibility of a large percentage of published findings has been questioned. There are undoubtedly many reasons for this, but one may be that investigators fool themselves because of a poor understanding of statistical concepts. In particular, investigators often make these mistakes: 1) P-hacking, that is, reanalyzing a data set in many different ways, or reanalyzing it with additional replicates, until the desired result appears; 2) overemphasis on P values rather than on the actual size of the observed effect; 3) overuse of statistical hypothesis testing, and being seduced by the word “significant”; and 4) over-reliance on standard errors, which are often misunderstood.
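To make the first mistake concrete, the following is a minimal simulation sketch of one form of P-hacking: adding replicates and retesting until P < 0.05. It is an illustration under stated assumptions, not a method taken from the article. Both groups are drawn from the same normal distribution, so every "significant" result is a false positive. The function names, the batch sizes, and the use of a z-test with known unit variance (chosen only to keep the example free of third-party libraries) are all hypothetical choices.

```python
import math
import random


def z_test_p(sample_a, sample_b):
    """Two-sided P value for a difference in means.

    Assumption for this sketch: both populations have known variance 1,
    so a simple z-test applies.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    diff = sum(sample_a) / n_a - sum(sample_b) / n_b
    z = diff / math.sqrt(1.0 / n_a + 1.0 / n_b)
    # P(|Z| > |z|) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2.0))


def false_positive_rate(peek, n_start=10, n_max=50, step=5,
                        trials=2000, alpha=0.05, seed=0):
    """Fraction of null experiments that end with P < alpha.

    peek=False: collect n_max replicates per group, then test once.
    peek=True:  test at n_start, then add `step` replicates per group and
                retest, stopping as soon as P < alpha or n_max is reached.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Both groups come from the same distribution: there is no real effect.
        a = [rng.gauss(0.0, 1.0) for _ in range(n_start)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_start)]
        if peek:
            while True:
                if z_test_p(a, b) < alpha:
                    hits += 1
                    break
                if len(a) >= n_max:
                    break
                a.extend(rng.gauss(0.0, 1.0) for _ in range(step))
                b.extend(rng.gauss(0.0, 1.0) for _ in range(step))
        else:
            extra = n_max - n_start
            a.extend(rng.gauss(0.0, 1.0) for _ in range(extra))
            b.extend(rng.gauss(0.0, 1.0) for _ in range(extra))
            if z_test_p(a, b) < alpha:
                hits += 1
    return hits / trials


if __name__ == "__main__":
    print("fixed sample size:", false_positive_rate(peek=False))
    print("test, add replicates, retest:", false_positive_rate(peek=True))
```

With a fixed sample size, the false-positive rate should stay near the nominal 5%. With repeated testing as replicates are added, it should come out well above that, even though no individual test was computed incorrectly.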
