Canonical Correlation Analysis based on Hilbert-Schmidt Independence Criterion and Centered Kernel Target Alignment

Canonical correlation analysis (CCA) is a well-established technique for identifying linear relationships between two sets of variables. Kernel CCA (KCCA) is its most notable nonlinear extension, but it lacks interpretability and is not robust against irrelevant features. This article introduces two nonlinear extensions of CCA that rely on the recently proposed Hilbert-Schmidt independence criterion and the centered kernel target alignment. Both extensions determine linear projections such that the projected data pairs are maximally dependent. The article demonstrates that using linear projections makes it possible to remove irrelevant features while extracting combinations of strongly associated features. This is exemplified through a simulation and through the analysis of recorded data sets that are available in the literature.
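To make the two dependence measures concrete, the following is a minimal sketch of how they can be evaluated on linearly projected data. It is not the optimisation procedure of the article: the Gaussian kernel, its bandwidth `sigma`, the biased HSIC estimator tr(KHLH)/(n-1)^2, the synthetic data and all function names are assumptions chosen for illustration. It only compares a fixed projection onto a relevant feature with a fixed projection onto an irrelevant one.

```python
import numpy as np

def gaussian_kernel(Z, sigma=1.0):
    """Gaussian (RBF) Gram matrix of the rows of Z (assumed kernel choice)."""
    sq = np.sum(Z**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def center(K):
    """Double-center a Gram matrix: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def hsic(K, L):
    """Biased empirical HSIC, tr(K H L H) / (n - 1)^2."""
    n = K.shape[0]
    return np.trace(center(K) @ L) / (n - 1) ** 2

def centered_alignment(K, L):
    """Centered kernel alignment: <Kc, Lc>_F / (||Kc||_F ||Lc||_F)."""
    Kc, Lc = center(K), center(L)
    return np.sum(Kc * Lc) / (np.linalg.norm(Kc) * np.linalg.norm(Lc))

# Synthetic example (hypothetical data): y depends nonlinearly on the
# first column of X only; the other two columns are irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] ** 2 + 0.1 * rng.normal(size=200))[:, None]
L = gaussian_kernel(y)

scores = {}
for name, u in [("relevant", [1.0, 0.0, 0.0]), ("irrelevant", [0.0, 1.0, 0.0])]:
    proj = X @ np.array(u)[:, None]          # linear projection of X onto u
    K = gaussian_kernel(proj)
    scores[name] = (hsic(K, L), centered_alignment(K, L))
    print(name, scores[name])
```

In this sketch the projection onto the relevant feature should score higher on both measures than the projection onto the irrelevant one, even though the relationship is purely quadratic and would be missed by linear correlation. A full method would search over the projection vectors rather than compare two fixed candidates.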
