Testing stochastic software using pseudo-oracles

Stochastic models can be difficult to test because of their complexity and randomness. Their predictions are often used to make important decisions, so they need to be correct. We introduce a new search-based technique for testing implementations of stochastic models by maximising the differences between the implementation and a pseudo-oracle. Our technique reduces testing effort and enables discrepancies to be found that might otherwise be overlooked. We show that the technique can identify differences that are challenging for humans to observe, and we use it to help a new user understand implementation differences in a real model of a citrus disease (Huanglongbing) used to inform policy and research.
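The general idea of searching for inputs that maximise the difference between an implementation and a pseudo-oracle can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two toy models (`model_a`, `model_b`), the planted discrepancy, the use of the two-sample Kolmogorov-Smirnov statistic as the difference measure, and plain random search standing in for whatever optimiser the authors use.

```python
import random


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    i = j = 0
    d = 0.0
    while i < len(xs) and j < len(ys):
        v = min(xs[i], ys[j])
        # Advance both samples past every value <= v so ties are handled together.
        while i < len(xs) and xs[i] <= v:
            i += 1
        while j < len(ys) and ys[j] <= v:
            j += 1
        d = max(d, abs(i / len(xs) - j / len(ys)))
    return d


def model_a(beta, n_hosts, rng):
    """Hypothetical implementation under test: each host is infected with probability beta."""
    return sum(rng.random() < beta for _ in range(n_hosts))


def model_b(beta, n_hosts, rng):
    """Hypothetical pseudo-oracle: same model, with a planted discrepancy above beta = 0.5."""
    p = min(beta, 0.5)  # planted difference, invisible for small beta
    return sum(rng.random() < p for _ in range(n_hosts))


def difference(beta, rng, runs=50, n_hosts=50):
    """Fitness: distance between the output distributions of the two models at one input."""
    out_a = [model_a(beta, n_hosts, rng) for _ in range(runs)]
    out_b = [model_b(beta, n_hosts, rng) for _ in range(runs)]
    return ks_statistic(out_a, out_b)


def search(candidates=30, seed=1):
    """Random search over beta in [0, 1], keeping the input with the largest difference."""
    rng = random.Random(seed)
    best_beta, best_score = None, -1.0
    for _ in range(candidates):
        beta = rng.random()
        score = difference(beta, rng)
        if score > best_score:
            best_beta, best_score = beta, score
    return best_beta, best_score


best_beta, best_score = search()
print(f"largest difference found at beta={best_beta:.2f} (KS statistic {best_score:.2f})")
```

Because both programs are stochastic, the sketch compares distributions of repeated runs rather than single outputs. Inputs where the two models agree yield only sampling noise, whereas inputs in the region of the planted discrepancy yield a large statistic, which is what directs the search towards it.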
