Ten Simple Rules for Reducing Overoptimistic Reporting in Methodological Computational Research

In most scientific fields, and in biomedical research in particular, there have long been many discussions on how to improve research practices and methods. This trend has intensified in recent years, as illustrated by the series on "reducing waste" published in The Lancet in January 2014 [1] and by the recent essay by John Ioannidis on how to make published results more true [2], which echoes his earlier provocative paper entitled "Why most published research findings are false" [3]. One important aspect underlying these discussions is that the biomedical literature is most often overoptimistic with respect to, for example, the superiority of a new therapy or the strength of association between a risk factor and an outcome. Published results appear more significant, more spectacular, or sometimes more intuitive (in a word, more "satisfactory") to authors and readers than they would if they accurately reflected the truth. The causes of this problem are diverse, numerous, and interrelated. The effects of "fishing for significance" strategies and of selective or incomplete reporting are exacerbated by design issues (e.g., small sample sizes, many investigated features) [3] and by publication bias [4], to cite only a few of the factors at work.

However, research and guidelines on how to reduce overoptimistic reporting in the context of computational research, including computational biology as an important special case, are surprisingly scarce. Many methodological articles published in the computational literature report the (vastly) superior performance of new methods [5], too often in general terms and with the direct or indirect implication that the presented positive results generalize to other settings. Such overoptimistic reporting confuses readers, makes the literature less credible and more difficult to interpret, and might even ultimately lead to a waste of resources in some cases. Here I take advantage of the popular "ten-simple-rules" format [6] to address the problem of overoptimistic reporting in methodological computational biology research, that is, in papers (termed "methodological papers" here) devoted primarily to the development and testing of new computational methods intended to be used by other researchers on other data in the future, rather than to the biological question itself or to the specific dataset at hand.
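
As a toy illustration of how "fishing for significance" with many candidate features and a small sample can produce seemingly impressive findings from pure noise, consider the following minimal simulation sketch. The sample size, number of features, and significance threshold used here are arbitrary illustrative choices, not values taken from the studies cited above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_samples, n_features, alpha = 20, 1000, 0.05  # illustrative settings only

# Outcome and features are pure noise: no feature is truly associated with y.
y = rng.normal(size=n_samples)
X = rng.normal(size=(n_samples, n_features))

# "Fishing for significance": test every feature against the outcome
# and keep only the nominally significant ones.
p_values = np.array([stats.pearsonr(X[:, j], y)[1] for j in range(n_features)])
n_hits = int((p_values < alpha).sum())

print(f"nominally significant features: {n_hits} "
      f"(roughly {alpha * n_features:.0f} expected by chance alone)")
print(f"smallest p-value: {p_values.min():.4f}")
# Reporting only the top hits makes a null dataset look like a discovery.
```

The same selection mechanism inflates reported performance when many method variants are tried on the same data and only the best-looking one is published.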

[1] P. Easterbrook, et al. Publication bias in clinical research. The Lancet, 1991.

[2] Philip E. Bourne, et al. Ten Simple Rules for Writing a PLOS Ten Simple Rules Article. PLoS Computational Biology, 2014.

[3] Alexander G. Fletcher, et al. Ten Simple Rules for Effective Computational Research. PLoS Computational Biology, 2014.

[4] Weixiong Zhang, et al. Ten Simple Rules for Writing Research Papers. PLoS Computational Biology, 2014.

[5] Philip E. Bourne, et al. Ten Simple Rules for Better Figures. PLoS Computational Biology, 2014.

[6] Anton Nekrutenko, et al. Ten Simple Rules for Reproducible Computational Research. PLoS Computational Biology, 2013.

[7] Anne-Laure Boulesteix, et al. A Plea for Neutral Comparison Studies in Computational Sciences. PLoS ONE, 2012.

[8] D. Wolpert. The Supervised Learning No-Free-Lunch Theorems. 2002.

[9] Philip E. Bourne, et al. Ten Simple Rules for Getting Published. PLoS Computational Biology, 2005.

[10] A. Boulesteix, et al. A Statistical Framework for Hypothesis Testing in Real Data Comparison Studies. 2015.

[11] Biomedical research: increasing value, reducing waste. The Lancet, 2014.

[12] Anne-Laure Boulesteix, et al. Over-optimism in bioinformatics: an illustration. Bioinformatics, 2010.

[13] J. Ioannidis. Why Most Published Research Findings Are False. PLoS Medicine, 2005.

[14] John P. A. Ioannidis. How to Make More Published Research True. PLoS Medicine, 2014.

[15] Joaquín Dopazo, et al. Papers on normalization, variable selection, classification or clustering of microarray data. Bioinformatics, 2009.

[16] Iveta Simera, et al. EQUATOR: reporting guidelines for health research. Open Medicine, 2008.

[17] Anne-Laure Boulesteix, et al. On representative and illustrative comparisons with real data in bioinformatics: response to the letter to the editor by Smith et al. Bioinformatics, 2013.

[18] Edward R. Dougherty, et al. Reporting bias when using real data sets to analyze classification performance. Bioinformatics, 2010.