An Extended Empirical Saddlepoint Approximation for Intractable Likelihoods

The challenges posed by complex stochastic models used in computational ecology, biology and genetics have stimulated the development of approximate approaches to statistical inference. Here we focus on Synthetic Likelihood (SL), a procedure that reduces the observed and simulated data to a set of summary statistics, and quantifies the discrepancy between them through a synthetic likelihood function. SL requires little tuning, but it relies on the approximate normality of the summary statistics. We relax this assumption by proposing a novel, more flexible, density estimator: the Extended Empirical Saddlepoint approximation. In addition to proving the consistency of SL, under either the new or the Gaussian density estimator, we illustrate the method using two examples. One of these is a complex individual-based forest model for which SL offers one of the few practical possibilities for statistical inference. The examples show that the new density estimator is able to capture large departures from normality, while being scalable to high dimensions, and this in turn leads to more accurate parameter estimates, relative to the Gaussian alternative. The new density estimator is implemented by the esaddle R package, which can be found on the Comprehensive R Archive Network (CRAN).
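
To make the baseline concrete, here is a minimal sketch of the Gaussian synthetic likelihood that the abstract describes: simulate data at a parameter value, reduce each simulated dataset to summary statistics, fit a multivariate normal to those statistics, and evaluate its log-density at the observed statistics. Everything named here is an assumption made for illustration: the exponential toy simulator, the choice of summaries, and the function names `simulate`, `summaries`, `cholesky` and `gaussian_synthetic_loglik` are hypothetical and are not taken from the paper or from the esaddle package. The sketch covers only the Gaussian density estimator, not the Extended Empirical Saddlepoint approximation.

```python
import math
import random
import statistics


def simulate(theta, n, rng):
    """Hypothetical toy simulator: n i.i.d. exponential draws with rate theta."""
    return [rng.expovariate(theta) for _ in range(n)]


def summaries(y):
    """Illustrative summary statistics: sample mean and log sample variance."""
    return [statistics.fmean(y), math.log(statistics.variance(y))]


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def gaussian_synthetic_loglik(theta, s_obs, n_obs, n_sims=200, seed=0):
    """Gaussian synthetic log-likelihood of theta given observed summaries s_obs."""
    # A fixed seed reuses the same random stream for every theta, which keeps
    # the estimated likelihood surface from jittering between evaluations.
    rng = random.Random(seed)
    sims = [summaries(simulate(theta, n_obs, rng)) for _ in range(n_sims)]
    d = len(s_obs)

    # Sample mean vector and covariance matrix of the simulated summaries.
    mu = [statistics.fmean([s[j] for s in sims]) for j in range(d)]
    cov = [
        [
            sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in sims) / (n_sims - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]

    # Multivariate normal log-density via the Cholesky factor:
    # solve low z = (s_obs - mu) by forward substitution.
    low = cholesky(cov)
    resid = [s_obs[j] - mu[j] for j in range(d)]
    z = []
    for i in range(d):
        z.append((resid[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(d))
    return -0.5 * (sum(v * v for v in z) + log_det + d * math.log(2.0 * math.pi))


# Usage: crude grid search for the rate that maximizes the synthetic likelihood.
observed = simulate(2.0, 100, random.Random(1))
s_obs = summaries(observed)
grid = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
best = max(grid, key=lambda th: gaussian_synthetic_loglik(th, s_obs, 100))
print("grid maximizer of the synthetic likelihood:", best)
```

The normal fit in `gaussian_synthetic_loglik` is exactly the step that depends on the summary statistics being approximately Gaussian; a more flexible density estimator would replace only that step while leaving the simulation and summary stages unchanged.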
