Composite Goodness-of-fit Tests with Kernels

Model misspecification can create significant challenges for the implementation of probabilistic models, and this has led to the development of a range of inference methods which directly account for the issue. However, these methods tend to lose efficiency and should only be used when the model is genuinely misspecified. Unfortunately, there is a lack of generally applicable methods for testing whether this is the case. One set of tools which can help are goodness-of-fit tests, which assess whether a dataset was generated from a fixed distribution. Kernel-based tests have been developed for this problem, and are popular due to their flexibility, strong theoretical guarantees, and ease of implementation in a wide range of scenarios. In this paper, we extend this line of work to the more challenging composite goodness-of-fit problem, where we are instead interested in whether the data come from any distribution in some parametric family.
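To illustrate the composite setting described above, the following is a minimal sketch of a kernel goodness-of-fit test against a parametric family. It is not the paper's own procedure: the choice of a Gaussian family, an MMD statistic with a Gaussian kernel, and calibration by parametric bootstrap (re-fitting the parameters on each bootstrap sample, so that the estimation step is reflected in the null distribution) are all illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian RBF kernel matrix between 1-d samples x and y."""
    d = x[:, None] - y[None, :]
    return np.exp(-d**2 / (2 * bandwidth**2))

def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()

def composite_gof_test(data, n_boot=200, level=0.05, seed=0):
    """Test H0: data ~ N(mu, sigma^2) for SOME (mu, sigma), via parametric bootstrap."""
    rng = np.random.default_rng(seed)
    m = len(data)
    # Step 1: estimate the parameters from the data (MLE for the Gaussian family).
    mu, sigma = data.mean(), data.std()
    # Step 2: test statistic — MMD between the data and a sample from the fitted model.
    stat = mmd2(data, rng.normal(mu, sigma, m))
    # Step 3: parametric bootstrap null distribution; crucially, the parameters are
    # re-estimated on each bootstrap sample to mimic the estimation step under H0.
    boot = np.empty(n_boot)
    for b in range(n_boot):
        xb = rng.normal(mu, sigma, m)
        mub, sb = xb.mean(), xb.std()
        boot[b] = mmd2(xb, rng.normal(mub, sb, m))
    threshold = np.quantile(boot, 1.0 - level)
    return stat, threshold, bool(stat > threshold)
```

Because the V-statistic estimate equals a squared RKHS distance between mean embeddings, both the statistic and the bootstrap threshold are nonnegative; the test rejects when the statistic exceeds the bootstrap quantile.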
