Bayesian Optimization for Likelihood-Free Inference of Simulator-Based Statistical Models

Our paper deals with inferring simulator-based statistical models given some observed data. A simulator-based model is a parametrized mechanism which specifies how data are generated. It is thus also referred to as generative model. We assume that only a finite number of parameters are of interest and allow the generative process to be very general; it may be a noisy nonlinear dynamical system with an unrestricted number of hidden variables. This weak assumption is useful for devising realistic models but it renders statistical inference very difficult. The main challenge is the intractability of the likelihood function. Several likelihood-free inference methods have been proposed which share the basic idea of identifying the parameters by finding values for which the discrepancy between simulated and observed data is small. A major obstacle to using these methods is their computational cost. The cost is largely due to the need to repeatedly simulate data sets and the lack of knowledge about how the parameters affect the discrepancy. We propose a strategy which combines probabilistic modeling of the discrepancy with optimization to facilitate likelihood-free inference. The strategy is implemented using Bayesian optimization and is shown to accelerate the inference through a reduction in the number of required simulations by several orders of magnitude.

[1]  Alan Fern,et al.  Batch Bayesian Optimization via Simulation Matching , 2010, NIPS.

[2]  Shipra Agrawal,et al.  Analysis of Thompson Sampling for the Multi-armed Bandit Problem , 2011, COLT.

[3]  Jeong-Soo Park,et al.  A statistical method for tuning a computer code to a data base , 2001 .

[4]  Paul Fearnhead,et al.  Constructing summary statistics for approximate Bayesian computation: semi‐automatic approximate Bayesian computation , 2012 .

[5]  Clayton D. Scott,et al.  Robust kernel density estimation , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[6]  Sonja Kuhnt,et al.  Design and analysis of computer experiments , 2010 .

[7]  D. Dennis,et al.  A statistical method for global optimization , 1992, [Proceedings] 1992 IEEE International Conference on Systems, Man, and Cybernetics.

[8]  Andreas Krause,et al.  High-Dimensional Gaussian Process Bandits , 2013, NIPS.

[9]  Mark M. Tanaka,et al.  Sequential Monte Carlo without likelihoods , 2007, Proceedings of the National Academy of Sciences.

[10]  M. Feldman,et al.  Population growth of human Y chromosomes: a study of Y chromosome microsatellites. , 1999, Molecular biology and evolution.

[11]  M. Rosenblatt Remarks on Some Nonparametric Estimates of a Density Function , 1956 .

[12]  M. Blum Approximate Bayesian Computation: A Nonparametric Perspective , 2009, 0904.0635.

[13]  Andreas Krause,et al.  Joint Optimization and Variable Selection of High-dimensional Gaussian Processes , 2012, ICML.

[14]  Aapo Hyv Estimation of Non-Normalized Statistical Models by Score Matching , 2005 .

[15]  Thomas J. Santner,et al.  The Design and Analysis of Computer Experiments , 2003, Springer Series in Statistics.

[16]  Neil D. Lawrence,et al.  Gaussian Processes for Big Data , 2013, UAI.

[17]  H. Niederreiter Low-discrepancy and low-dispersion sequences , 1988 .

[18]  Junichiro Hirayama,et al.  Bregman divergence as general framework to estimate unnormalized statistical models , 2011, UAI.

[19]  Prabhat,et al.  Scalable Bayesian Optimization Using Deep Neural Networks , 2015, ICML.

[20]  Andreas Huth,et al.  Statistical inference for stochastic simulation models--theory and application. , 2011, Ecology letters.

[21]  P. Diggle,et al.  Monte Carlo Methods of Inference for Implicit Statistical Models , 1984 .

[22]  W. R. Thompson ON THE LIKELIHOOD THAT ONE UNKNOWN PROBABILITY EXCEEDS ANOTHER IN VIEW OF THE EVIDENCE OF TWO SAMPLES , 1933 .

[23]  L. Excoffier,et al.  Efficient Approximate Bayesian Computation Coupled With Markov Chain Monte Carlo Without Likelihood , 2009, Genetics.

[24]  Michalis K. Titsias,et al.  Variational Learning of Inducing Variables in Sparse Gaussian Processes , 2009, AISTATS.

[25]  A. OHagan,et al.  Bayesian analysis of computer code outputs: A tutorial , 2006, Reliab. Eng. Syst. Saf..

[26]  Le Song,et al.  Kernel Bayes' rule: Bayesian inference with positive definite kernels , 2013, J. Mach. Learn. Res..

[27]  Aapo Hyvärinen,et al.  Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics , 2012, J. Mach. Learn. Res..

[28]  Aapo Hyvärinen,et al.  A Family of Computationally E cient and Simple Estimators for Unnormalized Statistical Models , 2010, UAI.

[29]  Jasper Snoek,et al.  Practical Bayesian Optimization of Machine Learning Algorithms , 2012, NIPS.

[30]  A. P. Dawid,et al.  Gaussian Processes to Speed up Hybrid Monte Carlo for Expensive Bayesian Integrals , 2003 .

[31]  Christian P. Robert,et al.  Monte Carlo Statistical Methods (Springer Texts in Statistics) , 2005 .

[32]  Andrew Gordon Wilson,et al.  Student-t Processes as Alternatives to Gaussian Processes , 2014, AISTATS.

[33]  M. Rosenblatt,et al.  Multivariate k-nearest neighbor density estimates , 1979 .

[34]  Christian P. Robert,et al.  Monte Carlo Statistical Methods , 2005, Springer Texts in Statistics.

[35]  S. Wood Statistical inference for noisy nonlinear ecological dynamic systems , 2010, Nature.

[36]  Paul Marjoram,et al.  Markov chain Monte Carlo without likelihoods , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[37]  Ritabrata Dutta,et al.  Likelihood-free inference via classification , 2014, Stat. Comput..

[38]  Lihong Li,et al.  An Empirical Evaluation of Thompson Sampling , 2011, NIPS.

[39]  Richard Wilkinson,et al.  Accelerating ABC methods using Gaussian processes , 2014, AISTATS.

[40]  P. Moral,et al.  Sequential Monte Carlo samplers , 2002, cond-mat/0212648.

[41]  A. Futschik,et al.  A Novel Approach for Choosing Summary Statistics in Approximate Bayesian Computation , 2012, Genetics.

[42]  A. O'Hagan,et al.  Bayesian calibration of computer models , 2001 .

[43]  D. Balding,et al.  Statistical Applications in Genetics and Molecular Biology On Optimal Selection of Summary Statistics for Approximate Bayesian Computation , 2011 .

[44]  J. Marin,et al.  Population Monte Carlo , 2004 .

[45]  Donald R. Jones,et al.  A Taxonomy of Global Optimization Methods Based on Response Surfaces , 2001, J. Glob. Optim..

[46]  Nando de Freitas,et al.  Bayesian Optimization in High Dimensions via Random Embeddings , 2013, IJCAI.

[47]  Nando de Freitas,et al.  A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning , 2010, ArXiv.

[48]  Andreas Krause,et al.  Parallelizing Exploration-Exploitation Tradeoffs with Gaussian Process Bandit Optimization , 2012, ICML.

[49]  Carl E. Rasmussen,et al.  A Unifying View of Sparse Approximate Gaussian Process Regression , 2005, J. Mach. Learn. Res..

[50]  Andreas Krause,et al.  Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting , 2009, IEEE Transactions on Information Theory.

[51]  Max Welling,et al.  GPS-ABC: Gaussian Process Surrogate Approximate Bayesian Computation , 2014, UAI.

[52]  Geoffrey E. Hinton Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.

[53]  Benjamin Van Roy,et al.  Learning to Optimize via Posterior Sampling , 2013, Math. Oper. Res..

[54]  W. Ricker Stock and Recruitment , 1954 .

[55]  D. Dennis,et al.  SDO : A Statistical Method for Global Optimization , 1997 .

[56]  Kirthevasan Kandasamy,et al.  High Dimensional Bayesian Optimisation and Bandits via Additive Models , 2015, ICML.

[57]  E. Parzen On Estimation of a Probability Density Function and Mode , 1962 .

[58]  Mats Gyllenberg,et al.  Estimating the Transmission Dynamics of Streptococcus pneumoniae from Strain Prevalence Data , 2013, Biometrics.

[59]  T. J. Mitchell,et al.  Bayesian Prediction of Deterministic Functions, with Applications to the Design and Analysis of Computer Experiments , 1991 .

[60]  D. Balding,et al.  Approximate Bayesian computation in population genetics. , 2002, Genetics.

[61]  Paul Marjoram,et al.  Statistical Applications in Genetics and Molecular Biology Approximately Sufficient Statistics and Bayesian Computation , 2011 .

[62]  Nicolas Vayatis,et al.  Parallel Gaussian Process Optimization with Upper Confidence Bound and Pure Exploration , 2013, ECML/PKDD.

[63]  Carl E. Rasmussen,et al.  Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.

[64]  Larry Wasserman,et al.  All of Statistics , 2004 .

[65]  David Welch,et al.  Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems , 2009, Journal of The Royal Society Interface.

[66]  Jean-Michel Marin,et al.  Approximate Bayesian computational methods , 2011, Statistics and Computing.

[67]  M. Gutmann,et al.  Statistical Inference of Intractable Generative Models via Classification , 2014 .