Local dimension reduction of summary statistics for likelihood-free inference

Approximate Bayesian computation (ABC) and other likelihood-free inference methods have gained popularity in the last decade, as they allow rigorous statistical inference for complex models without analytically tractable likelihood functions. A key component of accurate inference with ABC is the choice of summary statistics, which should capture the information in the data while remaining low-dimensional for efficiency. Several dimension reduction techniques have been introduced to automatically construct informative, low-dimensional summaries from a possibly large pool of candidate summaries. Projection-based methods, which learn simple functional relationships from the summaries to the parameters, are widely used and usually perform well, but may fail when the assumptions behind the transformation are not satisfied. We introduce a localization strategy for any projection-based dimension reduction method, in which the transformation is estimated in the neighborhood of the observed data instead of over the whole space. Localization strategies have been suggested before, but the performance of the transformed summaries outside the local neighborhood has not been guaranteed. In our localization approach, the transformation is validated and optimized over validation datasets, ensuring reliable performance. We demonstrate the improvement in estimation accuracy for localized versions of linear regression and partial least squares on three models of varying complexity.
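The localization idea can be illustrated with a minimal sketch for the linear regression case. It is not the authors' implementation: the toy simulator, the prior range, the candidate neighborhood sizes, and the choice of validation sets (held-out simulations closest to the observed summaries) are all assumptions made for illustration. The sketch fits a regression from candidate summaries to the parameter using only the k simulations nearest to the observed data, then picks k by its error on the validation sets.

```python
import numpy as np

def simulate(theta, rng):
    # Hypothetical toy simulator: three noisy candidate summaries of a scalar parameter.
    noise = rng.normal(scale=0.1, size=(theta.shape[0], 3))
    return np.column_stack([np.exp(theta / 2), theta ** 2, np.sin(theta)]) + noise

def fit_linear(S, theta):
    # Least-squares regression of the parameter on the summaries, with an intercept.
    X = np.column_stack([np.ones(len(S)), S])
    coef, *_ = np.linalg.lstsq(X, theta, rcond=None)
    return coef

def project(coef, S):
    # Apply the fitted linear transformation; returns one value per row of S.
    return coef[0] + np.atleast_2d(S) @ coef[1:]

def nearest(S, s_obs, scale, k):
    # Indices of the k rows of S closest to s_obs in standardized Euclidean distance.
    d = np.linalg.norm((S - s_obs) / scale, axis=1)
    return np.argsort(d)[:k]

rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 4.0, size=5000)       # assumed uniform prior
S = simulate(theta, rng)
s_obs = simulate(np.array([3.0]), rng)[0]      # stand-in for the observed data
scale = S.std(axis=0)

train = np.arange(4000)
held_out = np.arange(4000, 5000)
# Assumption: validation sets are the 100 held-out simulations nearest to s_obs.
val_idx = held_out[nearest(S[held_out], s_obs, scale, 100)]

def fit_local(k):
    # Fit the projection on the k training simulations nearest to s_obs.
    idx = train[nearest(S[train], s_obs, scale, k)]
    return fit_linear(S[idx], theta[idx])

def validation_error(k):
    # Mean squared error of the localized projection on the validation sets.
    pred = project(fit_local(k), S[val_idx])
    return float(np.mean((pred - theta[val_idx]) ** 2))

candidates = [200, 500, 1000, 4000]            # 4000 uses all training data (global fit)
errors = {k: validation_error(k) for k in candidates}
best_k = min(errors, key=errors.get)
local_summary = project(fit_local(best_k), s_obs)[0]
```

Here `local_summary` is the one-dimensional transformed summary of the observed data, which would then be passed to an ABC sampler. Including the global fit among the candidates means the selected transformation is never worse than the global one on the validation sets used.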
