Gen-Oja: A Two-time-scale approach for Streaming CCA

We study principal generalized eigenvector computation and Canonical Correlation Analysis (CCA) in the stochastic setting. We propose a simple and efficient algorithm, Gen-Oja, for these problems. Borrowing ideas from the theory of fast-mixing Markov chains and two-time-scale stochastic approximation, we prove that the algorithm converges globally and that it achieves the optimal rate of convergence. In the process, we develop tools for analyzing stochastic processes with Markovian noise, which may be of independent interest.
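The abstract does not spell out the update rule, so the following is only a minimal sketch of what a two-time-scale, Oja-style iteration for the generalized eigenvector problem A v = λ B v could look like. It is an illustration under stated assumptions, not the authors' reference implementation. A fast iterate `w` tracks an approximate solution of B w = A v with a constant step `alpha`, while a slow iterate `v` takes a normalized Oja-like step along `w` with a decaying step `beta`. The function name `gen_oja_sketch`, the step-size schedules, the additive Gaussian noise model, and the 2x2 test matrices are all hypothetical choices made for this example.

```python
import math
import random


def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in m]


def noisy(m, sigma, rng):
    """Return a noisy copy of m: an assumed stand-in for a streaming sample."""
    return [[m_ij + rng.gauss(0.0, sigma) for m_ij in row] for row in m]


def gen_oja_sketch(a, b, steps=5000, alpha=0.1, sigma=0.1, seed=0):
    """Two-time-scale sketch for the top generalized eigenvector of (a, b).

    Step sizes and the noise model are illustrative assumptions,
    not values taken from the paper.
    """
    rng = random.Random(seed)
    d = len(a)
    v = [1.0 / math.sqrt(d)] * d  # slow iterate, kept at unit norm
    w = [0.0] * d                 # fast iterate, tracks the solution of b w = a v
    for t in range(1, steps + 1):
        a_t = noisy(a, sigma, rng)
        b_t = noisy(b, sigma, rng)
        # Fast time scale: one stochastic gradient step on the
        # least-squares problem whose solution is b^{-1} a v.
        bw = matvec(b_t, w)
        av = matvec(a_t, v)
        w = [w_i - alpha * (bw_i - av_i) for w_i, bw_i, av_i in zip(w, bw, av)]
        # Slow time scale: Oja-like step along w, then renormalize.
        beta = 1.0 / (10.0 + t)
        v = [v_i + beta * w_i for v_i, w_i in zip(v, w)]
        norm = math.sqrt(sum(v_i * v_i for v_i in v))
        v = [v_i / norm for v_i in v]
    return v


# Toy problem: b^{-1} a = diag(3, 0.5), so the top generalized
# eigenvector is the first coordinate axis.
A = [[3.0, 0.0], [0.0, 1.0]]
B = [[1.0, 0.0], [0.0, 2.0]]
v_hat = gen_oja_sketch(A, B)
```

On this toy problem the estimate `v_hat` should align closely with the first coordinate axis. In a CCA setting, `a_t` and `b_t` would instead be built from the cross-covariance and covariance of each incoming pair of samples; that construction is omitted here.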
