Probabilistic Integration

Probabilistic numerical methods aim to model numerical error as a source of epistemic uncertainty that is subject to probabilistic analysis and reasoning, enabling the principled propagation of numerical uncertainty through a computational pipeline. In this paper we focus on numerical methods for integration. We present probabilistic (Bayesian) versions of both Markov chain Monte Carlo and quasi-Monte Carlo methods for integration and provide rigorous theoretical guarantees on convergence rates, for both the posterior mean and posterior contraction. The performance of probabilistic integrators is guaranteed to be no worse than that of non-probabilistic integrators and is, in many cases, asymptotically superior. These probabilistic integrators therefore enjoy the “best of both worlds”, leveraging the sampling efficiency of advanced Monte Carlo methods whilst being equipped with valid probabilistic models for uncertainty quantification. Several applications and illustrations are provided, including examples from computer vision and system modelling using non-linear differential equations. A survey of open challenges in probabilistic integration is also provided.
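
As a rough illustration of what a probabilistic (Bayesian) integrator computes, the following minimal sketch places a zero-mean Gaussian process prior on the integrand and returns a posterior mean and variance for its integral. Everything specific here is an assumption made for the example, not a detail taken from the paper: the domain [0, 1] with the uniform measure, the Gaussian kernel with length-scale `ell`, the small `jitter` term, the evenly spaced nodes, and all function names. In practice the nodes could instead be drawn by a Monte Carlo or quasi-Monte Carlo scheme.

```python
import math
import numpy as np


def gaussian_kernel(x, y, ell):
    """Gaussian kernel matrix k(x_i, y_j) = exp(-(x_i - y_j)^2 / (2 ell^2))."""
    d = x[:, None] - y[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)


def kernel_mean(x, ell):
    """Closed form of z(x) = integral over [0, 1] of k(x, y) dy."""
    c = ell * math.sqrt(math.pi / 2.0)
    s = ell * math.sqrt(2.0)
    return np.array([c * (math.erf((1.0 - xi) / s) + math.erf(xi / s)) for xi in x])


def prior_integral_variance(ell):
    """Closed form of the double integral of k(x, y) over [0, 1]^2."""
    erf_term = ell * math.sqrt(math.pi / 2.0) * math.erf(1.0 / (ell * math.sqrt(2.0)))
    exp_term = ell ** 2 * (1.0 - math.exp(-0.5 / ell ** 2))
    return 2.0 * (erf_term - exp_term)


def bayesian_quadrature(f, nodes, ell=0.2, jitter=1e-8):
    """Posterior mean and variance of the integral of f over [0, 1].

    Assumes a zero-mean GP prior with a Gaussian kernel (illustrative choice).
    """
    # Jitter on the diagonal keeps the linear solve numerically stable.
    K = gaussian_kernel(nodes, nodes, ell) + jitter * np.eye(len(nodes))
    z = kernel_mean(nodes, ell)
    weights = np.linalg.solve(K, z)  # quadrature weights K^{-1} z
    mean = float(weights @ f(nodes))
    # Clip at zero to guard against tiny negative values from round-off.
    var = max(prior_integral_variance(ell) - float(weights @ z), 0.0)
    return mean, var


nodes = np.linspace(0.0, 1.0, 15)
mean, var = bayesian_quadrature(np.exp, nodes)
print(f"estimate = {mean:.6f}, posterior std = {math.sqrt(var):.2e}")
```

The posterior variance depends only on the nodes and the kernel, not on the function values, so it quantifies how much uncertainty about the integral remains under the assumed prior after evaluating at those nodes.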
