Model-independent comparison of simulation output

Abstract: Computational models of complex systems are usually elaborate and sensitive to implementation details, characteristics that often hinder their verification and validation. Model replication is a possible solution to this issue. It avoids biases associated with the language or toolkit used to develop the original model, not only promoting its verification and validation but also fostering the credibility of the underlying conceptual model. However, different model implementations must be compared to assess their equivalence. The problem is how to show, given two or more implementations of a stochastic model, that they display similar behavior. In this paper, we present a model comparison technique that uses principal component analysis to convert simulation output into a set of linearly uncorrelated statistical measures, analyzable in a consistent, model-independent fashion. It is appropriate for ascertaining the distributional equivalence of a model replication with its original implementation. Besides model independence, this technique has three other desirable properties: a) it automatically selects the output features that best explain implementation differences; b) it does not depend on the distributional properties of simulation output; and c) it simplifies the modelers’ work, as it can be used directly on simulation outputs. The proposed technique is shown to produce results similar to those of manual or empirical selection of output features when applied to a well-studied reference model.
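The general idea can be illustrated with a minimal sketch. It is not the authors' implementation, only one plausible reading of the abstract under stated assumptions: each simulation run is flattened into one row of a matrix, the runs from both implementations are stacked and projected onto principal components, and the per-component scores of the two groups are then compared. The function names (`pca_scores`, `permutation_pvalue`, `compare_implementations`), the 90% variance threshold, and the use of a permutation test on the difference in means are all assumptions made for illustration; the abstract does not specify which statistical tests are applied to the components.

```python
import numpy as np


def pca_scores(outputs, var_to_explain=0.9):
    """Project runs (rows) onto the leading principal components.

    Keeps the smallest number of components whose cumulative explained
    variance reaches `var_to_explain` (an assumed threshold).
    """
    centered = outputs - outputs.mean(axis=0)
    # SVD of the centered data; u * s gives the principal component scores.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    n_pcs = int(np.searchsorted(np.cumsum(explained), var_to_explain) + 1)
    n_pcs = min(n_pcs, len(s))
    return u[:, :n_pcs] * s[:n_pcs], explained[:n_pcs]


def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in means (assumed test)."""
    rng = np.random.default_rng(seed)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel runs at random
        diff = abs(pooled[:len(a)].mean() - pooled[len(a):].mean())
        count += int(diff >= observed)
    return (count + 1) / (n_perm + 1)


def compare_implementations(runs_a, runs_b, var_to_explain=0.9):
    """Return one p-value per retained component, plus explained variance."""
    stacked = np.vstack([runs_a, runs_b])
    scores, explained = pca_scores(stacked, var_to_explain)
    n_a = len(runs_a)
    pvals = [
        permutation_pvalue(scores[:n_a, k], scores[n_a:, k])
        for k in range(scores.shape[1])
    ]
    return pvals, explained


if __name__ == "__main__":
    # Synthetic stand-in for simulation output: 30 runs x 50 time steps each.
    rng = np.random.default_rng(42)
    original = rng.normal(size=(30, 50))
    replication = rng.normal(size=(30, 50))
    pvals, explained = compare_implementations(original, replication)
    for k, (p, ev) in enumerate(zip(pvals, explained), start=1):
        print(f"PC{k}: explained variance {ev:.3f}, p-value {p:.3f}")
```

In this sketch, small p-values on the leading components would suggest that the two implementations are not distributionally equivalent. Because several components are tested, a multiple-comparison adjustment would likely be needed in practice.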
