Kernel Canonical Correlation Analysis and Its Applications to Nonlinear Measures of Association and Test of Independence

Measures of association between two sets of random variables have long been of interest to statisticians. Classical canonical correlation analysis can characterize linear association, but it is also limited to linear association. In this article we study nonlinear association measures using the kernel method. The introduction of kernel methods from the machine learning community has had a great impact on statistical analysis. Kernel canonical correlation analysis (KCCA) generalizes classical linear canonical correlation analysis to a nonlinear setting. This generalization is nonparametric. It allows us to depict the nonlinear relation between two sets of variables and extends classical multivariate data analysis techniques, originally constrained to linear relations, to nonlinear ones. Moreover, kernel-based canonical correlation analysis no longer requires a Gaussian distributional assumption on the observations, which greatly enhances its applicability. The purpose of this article is twofold. The first aim is to link the KCCA emerging from the machine learning community to the nonlinear canonical analysis in the statistical literature; the second is to provide further statistical applications of KCCA, including association measures, dimension reduction, and tests of independence without the usual Gaussian assumption. Implementation algorithms are discussed and several examples are presented.
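
To make the idea of a kernel-based association measure concrete, the following is a minimal sketch of a regularized first kernel canonical correlation. It is not the article's algorithm: it assumes Gaussian kernels, a shared bandwidth `sigma`, and a ridge-type regularization parameter `kappa` of a kind commonly used in the kernel CCA literature. All function names and parameter values are illustrative assumptions.

```python
import numpy as np

def gaussian_gram(x, sigma):
    """Gaussian kernel Gram matrix for the rows of x (1-D input is treated as n points)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum(x**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def center(K):
    """Center a Gram matrix in feature space: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def first_kernel_canonical_correlation(x, y, sigma=1.0, kappa=0.1):
    """Largest regularized kernel canonical correlation between samples x and y.

    Assumption: regularization of the form R = K (K + (n * kappa / 2) I)^{-1};
    the estimate is the largest singular value of R_x R_y.
    """
    n = len(x)
    Kx = center(gaussian_gram(x, sigma))
    Ky = center(gaussian_gram(y, sigma))
    reg = 0.5 * n * kappa * np.eye(n)
    # K and (K + reg)^{-1} commute, so solve(K + reg, K) equals K (K + reg)^{-1}.
    Rx = np.linalg.solve(Kx + reg, Kx)
    Ry = np.linalg.solve(Ky + reg, Ky)
    return np.linalg.svd(Rx @ Ry, compute_uv=False)[0]

# Illustrative data: y depends on x only through x**2, so linear correlation is weak.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, 200)
y = x**2 + 0.05 * rng.standard_normal(200)
print("Pearson correlation:", np.corrcoef(x, y)[0, 1])
print("Regularized kernel canonical correlation:", first_kernel_canonical_correlation(x, y))
```

The regularization shrinks the estimate below one; without it, the empirical kernel canonical correlation is typically degenerate for a finite sample. The value depends on `sigma` and `kappa`, so it is best read relative to a baseline such as the same statistic computed on independent (for example, permuted) data.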
