Testing scientific software: A systematic literature review

CONTEXT Scientific software plays an important role in critical decision making, for example in making weather predictions based on climate models and in computing evidence for research publications. Recently, scientists have had to retract publications due to errors caused by software faults. Systematic testing can identify such faults in code.

OBJECTIVE This study aims to identify specific challenges, proposed solutions, and unsolved problems faced when testing scientific software.

METHOD We conducted a systematic literature survey to identify and analyze relevant literature. We identified 62 studies that provided relevant information about testing scientific software.

RESULTS We found that challenges faced when testing scientific software fall into two main categories: (1) testing challenges that occur due to characteristics of scientific software, such as oracle problems, and (2) testing challenges that occur due to cultural differences between scientists and the software engineering community, such as viewing the code and the model that it implements as inseparable entities. In addition, we identified methods that could potentially overcome these challenges, along with the limitations of those methods. Finally, we describe unsolved challenges and how software engineering researchers and practitioners can help to overcome them.

CONCLUSIONS Scientific software presents special challenges for testing. Specifically, cultural differences between scientist developers and software engineers, along with the characteristics of scientific software, make testing more difficult. Existing techniques such as code clone detection can help to improve the testing process. Software engineers should consider the special challenges posed by scientific software, such as oracle problems, when developing testing techniques.
