Sufficient Canonical Correlation Analysis

Canonical correlation analysis (CCA) finds a pair of subspaces, one for each of two random vectors, in which Pearson's correlation coefficients between the projected vectors are maximized. Because of its well-established theoretical support and relatively efficient computation, CCA is widely used as a joint dimension reduction tool and has been successfully applied to many image processing and computer vision tasks. However, traditional CCA has been reported to overfit in many practical cases. In this paper, we propose sufficient CCA (S-CCA), which is inspired by the theory of sufficient dimension reduction, to relieve CCA's overfitting problem. The effectiveness of S-CCA is verified both theoretically and experimentally. Experimental results also demonstrate that S-CCA outperforms some of CCA's popular extensions during the prediction phase, especially when severe overfitting occurs.
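
For readers unfamiliar with the baseline, the following is a minimal sketch of classical linear CCA only, not of the proposed S-CCA, whose details are not given in this abstract. It whitens each view and takes an SVD of the whitened cross-covariance, a standard textbook formulation. The function name `linear_cca`, the small ridge term `reg`, and the synthetic data are assumptions made for illustration. Comparing the correlation on the fitting samples with the correlation on held-out samples is one simple way to observe the overfitting mentioned above.

```python
import numpy as np

def linear_cca(X, Y, n_components=1, reg=1e-8):
    """Classical CCA via whitening + SVD (illustrative sketch).

    X: (n, p) and Y: (n, q) hold paired samples row by row.
    Returns projection matrices A (p, k), B (q, k) and the
    top-k canonical correlations.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariances; `reg` is a tiny ridge for numerical stability (assumption).
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    A = Kx @ U[:, :n_components]
    B = Ky @ Vt.T[:, :n_components]
    return A, B, s[:n_components]

# Synthetic two-view data (illustrative only): both views share one latent signal z.
rng = np.random.default_rng(0)
n = 400
z = rng.normal(size=(n, 1))
X = np.hstack([z + 0.5 * rng.normal(size=(n, 1)), rng.normal(size=(n, 4))])
Y = np.hstack([rng.normal(size=(n, 3)), z + 0.5 * rng.normal(size=(n, 1))])

# Fit on the first half, evaluate on the held-out second half.
X_tr, Y_tr, X_te, Y_te = X[:200], Y[:200], X[200:], Y[200:]
A, B, rho = linear_cca(X_tr, Y_tr)
train_corr = np.corrcoef(X_tr @ A[:, 0], Y_tr @ B[:, 0])[0, 1]
test_corr = np.corrcoef(X_te @ A[:, 0], Y_te @ B[:, 0])[0, 1]
print("train correlation:", train_corr)
print("held-out correlation:", test_corr)
```

With few samples relative to the dimensions of the two views, the training correlation tends to be optimistic compared with the held-out one; that gap is the overfitting behavior the abstract refers to.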
