Convergence Analysis of Deterministic Kernel-Based Quadrature Rules in Misspecified Settings

This paper presents a convergence analysis of kernel-based quadrature rules in misspecified settings, focusing on deterministic quadrature in Sobolev spaces. Specifically, we consider settings where the test integrand is less smooth than the Sobolev RKHS on which the quadrature rule is constructed. We provide convergence guarantees under two different assumptions on the quadrature rule: one on the quadrature weights and the other on the design points. More precisely, we show that convergence rates can be derived (i) if the sum of absolute weights remains constant (or does not increase quickly), or (ii) if the minimum distance between design points does not decrease too quickly. As a consequence of the latter result, we derive a rate of convergence for Bayesian quadrature in misspecified settings. We identify a condition on the design points that makes Bayesian quadrature robust to misspecification, and show that, under this condition, it may adaptively achieve the optimal rate of convergence in the Sobolev space of lower order (i.e., of the unknown smoothness of the test integrand), under a slightly stronger regularity condition on the integrand.
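
To make the two quantities in conditions (i) and (ii) concrete, the following is a minimal sketch rather than the construction analysed in the paper. It assumes the exponential (Matérn-1/2) kernel on [0, 1] with the uniform measure, whose RKHS is norm-equivalent to a first-order Sobolev space, and an illustrative integrand |x - 0.3|^0.3 that lies outside that space. The function names, the length scale `ell`, and the midpoint design are all hypothetical choices made for illustration.

```python
import numpy as np

def exp_kernel(x, y, ell=0.5):
    # Exponential (Matern-1/2) kernel matrix k(x, y) = exp(-|x - y| / ell).
    return np.exp(-np.abs(x[:, None] - y[None, :]) / ell)

def exp_kernel_mean(x, ell=0.5):
    # Closed-form integral of k(x, y) over y in [0, 1] (uniform measure).
    return ell * (2.0 - np.exp(-x / ell) - np.exp(-(1.0 - x) / ell))

def bq_weights(points, ell=0.5):
    # Bayesian quadrature weights w = K^{-1} z, with z the kernel mean.
    K = exp_kernel(points, points, ell)
    z = exp_kernel_mean(points, ell)
    return np.linalg.solve(K, z)

def sum_abs_weights(weights):
    # Quantity controlled in condition (i).
    return float(np.sum(np.abs(weights)))

def min_separation(points):
    # Quantity controlled in condition (ii): minimum distance between points (1D case).
    return float(np.min(np.diff(np.sort(points))))

# Hypothetical design: n equally spaced midpoints in [0, 1].
n = 20
points = (np.arange(n) + 0.5) / n
weights = bq_weights(points)

# Illustrative rough integrand: its derivative is not square integrable,
# so it is less smooth than the first-order Sobolev RKHS assumed above.
def f(x):
    return np.abs(x - 0.3) ** 0.3

estimate = float(weights @ f(points))
truth = (0.3 ** 1.3 + 0.7 ** 1.3) / 1.3  # exact integral of f over [0, 1]

print("sum of |weights|:", sum_abs_weights(weights))
print("min separation  :", min_separation(points))
print("abs error       :", abs(estimate - truth))
```

Tracking `sum_abs_weights` and `min_separation` as n grows gives a direct numerical check of whether a given design stays within the regime of condition (i) or (ii).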
