A low variance consistent test of relative dependency

We describe a novel non-parametric statistical hypothesis test of relative dependence between a source variable and two candidate target variables. Such a test enables us to determine whether the source variable is significantly more dependent on the first target variable than on the second. Dependence is measured via the Hilbert-Schmidt Independence Criterion (HSIC), resulting in a pair of empirical dependence measures (source-target 1, source-target 2). We test whether the first dependence measure is significantly larger than the second. Modeling the covariance between these HSIC statistics leads to a provably more powerful test than the construction of independent HSIC statistics by sub-sampling. The resulting test is consistent and unbiased, and (being based on U-statistics) has favorable convergence properties. The test can be computed in quadratic time, matching the computational complexity of standard empirical HSIC estimators. The effectiveness of the test is demonstrated on several real-world problems: we identify language groups from a multilingual corpus, and we show that tumor location is more dependent on gene expression than on chromosomal imbalances. Source code is available for download.
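As a rough illustration of the pair of empirical dependence measures described above, the following sketch computes two HSIC values in quadratic time and takes their difference. It is not the authors' released code. The function names, the Gaussian kernel with fixed bandwidth 1.0, and the synthetic data are all assumptions made for the example, and the estimator is the standard unbiased U-statistic form of HSIC. The sketch stops at the raw difference: turning it into a test requires the joint covariance of the two statistics, which is derived in the paper and not reproduced here.

```python
import numpy as np

def gaussian_gram(a, sigma=1.0):
    """Gaussian-kernel Gram matrix (bandwidth sigma is an assumed choice)."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    sq = np.sum(a ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * a @ a.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def hsic_unbiased(K, L):
    """Unbiased U-statistic HSIC estimate from two n x n Gram matrices (n > 3)."""
    n = K.shape[0]
    Kt = K - np.diag(np.diag(K))  # zero the diagonals
    Lt = L - np.diag(np.diag(L))
    term1 = np.sum(Kt * Lt)  # equals trace(Kt @ Lt) for symmetric matrices
    term2 = Kt.sum() * Lt.sum() / ((n - 1) * (n - 2))
    term3 = 2.0 * (Kt.sum(axis=1) @ Lt.sum(axis=1)) / (n - 2)
    return (term1 + term2 - term3) / (n * (n - 3))

# Synthetic example (assumed data): y depends strongly on x, z is independent of x.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)              # source
y = x + 0.1 * rng.normal(size=n)    # target 1
z = rng.normal(size=n)              # target 2

Kx, Ky, Kz = gaussian_gram(x), gaussian_gram(y), gaussian_gram(z)
hsic_xy = hsic_unbiased(Kx, Ky)     # source-target 1
hsic_xz = hsic_unbiased(Kx, Kz)     # source-target 2
difference = hsic_xy - hsic_xz      # the quantity the relative test examines
```

Each Gram matrix and each HSIC estimate costs O(n^2) operations, in line with the quadratic-time claim. Because the two statistics share the same source sample, they are correlated, which is why the paper models their covariance rather than treating them as independent.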
