Generalized canonical correlation analysis for classification

For multiple multivariate data sets, we derive conditions under which Generalized Canonical Correlation Analysis (GCCA) improves the classification performance of the projected data, compared with standard Canonical Correlation Analysis (CCA) applied to only two data sets. We illustrate the theoretical results with simulations and a real-data experiment.
