Troubling Trends in Scientific Software Use

"Blind trust" is dangerous when choosing software to support research. Software pervades every domain of science (1–3), perhaps nowhere more decisively than in modeling. In key scientific areas of great societal importance, models and the software that implements them define both how science is done and what science is done (4, 5). Across all science, this dependence has raised concerns about the need for open access to software (6, 7), centered on the reproducibility of research (1, 8–10). Fields such as high-performance computing offer key insights and best practices for developing, standardizing, and implementing software (11). Open and systematic approaches to software development are essential for all sciences, but for many scientists they are not sufficient. We describe problems with the adoption and use of scientific software.

[1] Darrel C. Ince, et al. The case for open computer programs, 2012, Nature.

[2] Tony Hey, et al. The Fourth Paradigm: Data-Intensive Scientific Discovery, 2009.

[3] Greg Wilson. How Do Scientists Really Use Computers, 2009.

[4] Trevor Hastie, et al. A statistical explanation of MaxEnt for ecologists, 2011.

[5] Egon L. Willighagen, et al. Changing computational research. The challenges ahead, 2012, Source Code for Biology and Medicine.

[6] Andy Roberts, et al. How Accurate Is Scientific Software?, 1994, IEEE Trans. Software Eng.

[7] Jennifer M. Urban, et al. Shining Light into Black Boxes, 2012, Science.

[8] Gregory J. Wilson, et al. Where’s the Real Bottleneck in Scientific Computing?, 2006.

[9] D. Warton, et al. Equivalence of MAXENT and Poisson Point Process Models for Species Distribution Modeling in Ecology, 2013, Biometrics.

[10] Greg Miller, et al. A Scientist's Nightmare: Software Problem Leads to Five Retractions, 2006, Science.

[11] Jacquelyn S. Fetrow, et al. Scientific Software Development Is Not an Oxymoron, 2006, PLoS Comput. Biol.

[12] Robert P. Anderson, et al. Maximum entropy modeling of species geographic distributions, 2006.

[13] Gary S Collins, et al. Interpreting diagnostic accuracy studies for patient care, 2012, BMJ: British Medical Journal.

[14] Jeffrey C. Carver, et al. Understanding the High-Performance-Computing Community: A Software Engineer's Perspective, 2008, IEEE Software.

[15] Z. Merali. Computational science: ...Error, 2010, Nature.

[16] Judith Segal, et al. Scientific End-User Developers and Barriers to User/Customer Engagement, 2011, J. Organ. End User Comput.

[17] J. Elith, et al. Species Distribution Models: Ecological Explanation and Prediction Across Space and Time, 2009.