Kernel Two-Sample and Independence Tests for Nonstationary Random Processes

Two-sample and independence tests based on the kernel statistics MMD and HSIC have performed strongly on i.i.d. data and stationary random processes. However, these statistics are not directly applicable to non-stationary random processes, a prevalent form of data in many scientific disciplines. In this work, we extend the application of MMD and HSIC to non-stationary settings by assuming access to independent realisations of the underlying random process. These realisations, in the form of non-stationary time-series measured on the same temporal grid, can then be viewed as i.i.d. samples from a multivariate probability distribution, to which MMD and HSIC can be applied. We further show how to choose suitable kernels over these high-dimensional spaces by maximising the estimated test power with respect to the kernel hyper-parameters. In experiments on synthetic data, we demonstrate that our proposed approaches achieve higher test power than current state-of-the-art functional or multivariate two-sample and independence tests. Finally, we apply our methods to a real socio-economic dataset as an example application.
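
The core idea, treating each realisation observed on a common temporal grid as one i.i.d. draw from a multivariate distribution, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and not taken from the paper: the function names, the synthetic random-walk data, the Gaussian kernel, the permutation-based p-value, and in particular the median-heuristic bandwidth, which stands in for the power-maximising kernel selection described above.

```python
import numpy as np


def pairwise_sq_dists(Z):
    """Squared Euclidean distances between all rows of Z."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    return np.maximum(d2, 0.0)  # clip tiny negatives from round-off


def mmd2_unbiased(K, idx_x, idx_y):
    """Unbiased estimate of squared MMD from a pooled Gram matrix K."""
    Kxx = K[np.ix_(idx_x, idx_x)]
    Kyy = K[np.ix_(idx_y, idx_y)]
    Kxy = K[np.ix_(idx_x, idx_y)]
    n, m = len(idx_x), len(idx_y)
    term_xx = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_xx + term_yy - 2.0 * Kxy.mean()


def mmd_permutation_test(X, Y, n_perm=500, seed=0):
    """Two-sample test on rows of X (n x T) and Y (m x T).

    Each row is one realisation of the process on a shared grid of T
    time points, treated as a single T-dimensional sample.
    Assumption: Gaussian kernel with median-heuristic bandwidth.
    """
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    n, total = len(X), len(X) + len(Y)

    d2 = pairwise_sq_dists(Z)
    bandwidth2 = np.median(d2[np.triu_indices(total, k=1)])
    K = np.exp(-d2 / (2.0 * bandwidth2))

    idx = np.arange(total)
    observed = mmd2_unbiased(K, idx[:n], idx[n:])

    # Null distribution: reshuffle which realisations belong to which sample.
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(total)
        if mmd2_unbiased(K, perm[:n], perm[n:]) >= observed:
            exceed += 1
    p_value = (1 + exceed) / (1 + n_perm)
    return observed, p_value


# Synthetic non-stationary example: random walks with and without drift.
rng = np.random.default_rng(42)
n_real, n_time = 40, 50
X = np.cumsum(rng.normal(size=(n_real, n_time)), axis=1)
Y = np.cumsum(rng.normal(loc=0.3, size=(n_real, n_time)), axis=1)

stat, p_value = mmd_permutation_test(X, Y)
print(f"MMD^2 estimate: {stat:.4f}, permutation p-value: {p_value:.4f}")
```

The same row-per-realisation layout would carry over to an HSIC-based independence test, with paired realisations of two processes in place of two separate samples.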
