B-test: A Non-parametric, Low Variance Kernel Two-sample Test

We propose a family of maximum mean discrepancy (MMD) kernel two-sample tests that have low sample complexity and are consistent. The tests have a hyperparameter that controls the tradeoff between sample complexity and computation time. Our family of tests, which we denote B-tests, is both computationally and statistically efficient, combining favorable properties of previously proposed MMD two-sample tests: it leverages samples more effectively to produce low variance estimates in the finite sample case, while requiring fewer than a quadratic number of kernel evaluations and entirely avoiding the complex null-hypothesis approximation that tests based on one-sample U-statistics require, all while maintaining consistency and probabilistically conservative thresholds on Type I error. Finally, recent results on combining multiple kernels transfer seamlessly to our hypothesis test, allowing a further increase in discriminative power and decrease in sample complexity.
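The low-variance, sub-quadratic estimate described above can be sketched as a block test: compute an unbiased MMD² statistic within each block of size B, then average across blocks, so that under the null hypothesis the average is asymptotically Gaussian and a simple normal threshold replaces expensive null-distribution approximation. The sketch below is a minimal illustration under assumed details (Gaussian RBF kernel, a fixed one-sided normal threshold); function names and parameters are hypothetical, not the paper's reference implementation.

```python
import numpy as np

def rbf(X, Y, sigma):
    # Pairwise Gaussian RBF kernel matrix between rows of X and Y.
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2_block(X, Y, sigma):
    # Unbiased within-block MMD^2 estimate: diagonal terms are excluded
    # from the within-sample sums so the statistic has mean zero under H0.
    n = X.shape[0]
    Kxx, Kyy, Kxy = rbf(X, X, sigma), rbf(Y, Y, sigma), rbf(X, Y, sigma)
    np.fill_diagonal(Kxx, 0.0)
    np.fill_diagonal(Kyy, 0.0)
    return (Kxx.sum() + Kyy.sum()) / (n * (n - 1)) - 2.0 * Kxy.mean()

def b_test(X, Y, B=32, sigma=1.0, z=1.6449):
    # Average the block-wise MMD^2 values. Under H0 the blocks are i.i.d.
    # with mean zero, so the studentized average is approximately N(0, 1);
    # z = 1.6449 is the one-sided Gaussian quantile at alpha = 0.05.
    n = (len(X) // B) * B  # drop the remainder so blocks have equal size
    blocks = np.array([mmd2_block(X[i:i + B], Y[i:i + B], sigma)
                       for i in range(0, n, B)])
    stat = blocks.mean() / (blocks.std(ddof=1) / np.sqrt(len(blocks)))
    return stat, bool(stat > z)
```

Larger B lowers the variance of each block estimate at the cost of more kernel evaluations per block (B·n total, versus n² for the full quadratic-time statistic), which is one way to read the sample-complexity/computation tradeoff controlled by the test's hyperparameter.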
