A Framework for Validation of Network-based Simulation Models: an Application to Modeling Interventions of Pandemics

Network-based computer simulation models are powerful tools for analyzing and guiding policy formation related to the systems being modeled. However, the inherently data- and computationally intensive nature of this model class gives rise to fundamental challenges when executing typical experimental designs, and in particular when performing model validation. Manual management of the complex simulation workflows and their associated data often requires a broad combination of skills and expertise, including domain expertise, mathematical modeling, programming, high-performance computing, statistical design, and data management, as well as the tracking of all assets and instances involved. This is a complex and error-prone process even under best practices, and even small slips may compromise model validation and significantly reduce human productivity. In this paper, we present a novel framework that addresses these challenges of model validation. The components of our framework form an ecosystem consisting of (i) model unification through a standardized model configuration format, (ii) simulation data management, (iii) support for experimental designs, and (iv) methods for uncertainty quantification and sensitivity analysis, all ultimately supporting the process of model validation. (Note that our view of validation is much more comprehensive than simply ensuring that the computational model can reproduce instances of historical data.) The design is extensible: domain experts, for example in experimental design, can contribute to the collection of available algorithms and methods. Additionally, our solution directly supports reproducible computational experiments and analysis, which in turn facilitates independent model verification and validation.
Finally, to showcase our design concept, we provide a sensitivity analysis for examining the consequences of different intervention strategies for an influenza pandemic.
