Efficient Algorithms for Large-scale Generalized Eigenvector Computation and Canonical Correlation Analysis

This paper considers the problem of canonical correlation analysis (CCA) (Hotelling, 1936) and, more broadly, the generalized eigenvector problem for a pair of symmetric matrices. These are two fundamental problems in data analysis and scientific computing, with numerous applications in machine learning and statistics (Shi and Malik, 2000; Hardoon et al., 2004; Witten et al., 2009). We provide simple iterative algorithms with improved runtimes for solving these problems. The algorithms are globally linearly convergent, with moderate dependencies on the condition numbers and eigenvalue gaps of the matrices involved. We obtain our results by reducing CCA to the top-$k$ generalized eigenvector problem, which we solve through a general framework that requires only black-box access to an approximate linear system solver. Instantiating this framework with accelerated gradient descent, we obtain a running time of $O\left(\frac{z k \sqrt{\kappa}}{\rho} \log(1/\epsilon) \log(k\kappa/\rho)\right)$, where $z$ is the total number of nonzero entries, $\kappa$ is the condition number, and $\rho$ is the relative eigenvalue gap of the appropriate matrices. This running time is linear in the input size and in the number of components $k$, up to a $\log(k)$ factor, which is essential for handling the large-scale matrices that appear in practice. To the best of our knowledge, this is the first such algorithm with global linear convergence. We hope that our results prompt further research and ultimately improve the practical running time of these important data analysis procedures on large data sets.
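The reduction and the solver-based framework described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm. It assumes the standard block-matrix form of CCA as a generalized eigenproblem $Aw = \lambda Bw$, computes only the top pair ($k = 1$), and uses plain conjugate gradient as a stand-in for the black-box linear system solver. The shift by $B$ inside the iteration is an added assumption: the eigenvalues of this block form come in $\pm$ pairs, and the shift makes the top one dominant. The function names `cg_solve`, `cca_as_gen_eig`, and `top_gen_eigvec` are hypothetical.

```python
import numpy as np

def cg_solve(B, b, iters=50, tol=1e-12):
    """Approximately solve B x = b for symmetric positive-definite B
    using plain conjugate gradient (a stand-in black-box solver)."""
    x = np.zeros_like(b)
    r = b.copy()              # residual b - B x for the starting point x = 0
    p = r.copy()              # current search direction
    rs = r @ r
    for _ in range(iters):
        if np.sqrt(rs) < tol:
            break
        Bp = B @ p
        alpha = rs / (p @ Bp)
        x += alpha * p
        r -= alpha * Bp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def cca_as_gen_eig(X, Y):
    """Build the pair (A, B) whose top generalized eigenvector stacks the
    top canonical directions of X and Y (assumed standard block form)."""
    n, dx = X.shape
    dy = Y.shape[1]
    X = X - X.mean(axis=0)    # center both views
    Y = Y - Y.mean(axis=0)
    Sxx, Syy, Sxy = X.T @ X / n, Y.T @ Y / n, X.T @ Y / n
    A = np.block([[np.zeros((dx, dx)), Sxy],
                  [Sxy.T, np.zeros((dy, dy))]])
    B = np.block([[Sxx, np.zeros((dx, dy))],
                  [np.zeros((dy, dx)), Syy]])
    return A, B

def top_gen_eigvec(A, B, solve=cg_solve, iters=200, seed=0):
    """Power iteration on B^{-1}(A + B), touching B only through `solve`."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(A.shape[0])
    for _ in range(iters):
        w = w + solve(B, A @ w)       # w <- B^{-1}(A + B) w via one linear solve
        w /= np.sqrt(w @ B @ w)       # renormalize in the B-norm
    lam = (w @ A @ w) / (w @ B @ w)   # generalized Rayleigh quotient
    return lam, w
```

Calling `A, B = cca_as_gen_eig(X, Y)` followed by `lam, w = top_gen_eigvec(A, B)` returns an estimate of the top canonical correlation in `lam`, with the two canonical directions stacked in `w`. Any routine with the same `(B, b)` signature can be passed as `solve`, which is the sense in which the solver is treated as a black box here.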

[1] H. Hotelling. Relations Between Two Sets of Variates, 1936.

[2] M. Hestenes, et al. Methods of conjugate gradients for solving linear systems, 1952.

[3] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2), 1983.

[4] T. Barnett, et al. Origins and Levels of Monthly and Seasonal Forecast Skill for United States Surface Air Temperatures Determined by Canonical Correlation Analysis, 1987.

[5] A. Barnston, et al. Prediction of ENSO Episodes Using Canonical Correlation Analysis, 1992.

[6] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[7] J. Shewchuk. An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, 1994.

[8] Jitendra Malik, et al. Normalized Cuts and Image Segmentation, 2000, IEEE Trans. Pattern Anal. Mach. Intell.

[9] H. Knutsson, et al. Detection of neural activity in functional MRI using canonical correlation analysis, 2001, Magnetic resonance in medicine.

[10] Shang-Hua Teng, et al. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems, 2003, STOC '04.

[11] John Shawe-Taylor, et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods, 2004, Neural Computation.

[12] Alissa R Sherry, et al. Conducting and Interpreting Canonical Correlation Analysis in Personality Research: A User-Friendly Primer, 2005, Journal of personality assessment.

[13] Yann LeCun, et al. The MNIST database of handwritten digits, 2005.

[14] Sham M. Kakade, et al. Multi-view Regression Via Canonical Correlation Analysis, 2007, COLT.

[15] Mark Tygert, et al. A Randomized Algorithm for Principal Component Analysis, 2008, SIAM J. Matrix Anal. Appl.

[16] R. Tibshirani, et al. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis, 2009, Biostatistics.

[17] Sham M. Kakade, et al. Multi-view clustering via canonical correlation analysis, 2009, ICML '09.

[18] M. Rudelson, et al. Non-asymptotic theory of random matrices: extreme singular values, 2010, 1003.2990.

[19] Robert H. Halstead, et al. Matrix Computations, 2011, Encyclopedia of Parallel Computing.

[20] Dean P. Foster, et al. Multi-View Learning of Word Embeddings via CCA, 2011, NIPS.

[21] Nathan Halko, et al. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, 2009, SIAM Rev.

[22] Tong Zhang, et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, 2013, NIPS.

[23] Dean P. Foster, et al. Large Scale Canonical Correlation Analysis with Iterative Least Squares, 2014, NIPS.

[24] Christos Boutsidis, et al. Efficient Dimensionality Reduction for Canonical Correlation Analysis, 2012, SIAM J. Sci. Comput.

[25] Paul Mineiro, et al. Discriminative Features via Generalized Eigenvectors, 2013, ICML.

[26] Cameron Musco, et al. Randomized Block Krylov Methods for Stronger and Faster Approximate Singular Value Decomposition, 2015, NIPS.

[27] Saurabh Paul. Core-Sets For Canonical Correlation Analysis, 2015, CIKM.

[28] Dean P. Foster, et al. Finding Linear Structure in Large Datasets with Scalable Canonical Correlation Analysis, 2015, ICML.

[29] Sham M. Kakade, et al. Robust Shift-and-Invert Preconditioning: Faster and More Sample Efficient Algorithms for Eigenvector Computation, 2015, ArXiv.

[30] Sham M. Kakade, et al. Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization, 2015, ICML.

[31] Nathan Srebro, et al. Stochastic optimization for deep CCA via nonlinear orthogonal iterations, 2015, 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[32] Cameron Musco, et al. Stronger Approximate Singular Value Decomposition via the Block Lanczos and Power Methods, 2015, ArXiv.

[33] Zaïd Harchaoui, et al. A Universal Catalyst for First-Order Optimization, 2015, NIPS.

[34] Elad Hazan, et al. Fast and Simple PCA via Convex Optimization, 2015, ArXiv.

[35] Karen Livescu, et al. Nonparametric Canonical Correlation Analysis, 2015, ICML.

[36] Efficient Algorithms for Large-scale Generalized Eigenvector Computation and CCA, 2016.

[37] Karen Livescu, et al. Large-Scale Approximate Kernel Canonical Correlation Analysis, 2015, ICLR.

[38] Nathan Srebro, et al. Globally Convergent Stochastic Optimization for Canonical Correlation Analysis, 2016, ArXiv.

[39] Nathan Srebro, et al. Efficient Globally Convergent Stochastic Optimization for Canonical Correlation Analysis, 2016, NIPS.