Scalable and flexible Max-Var generalized canonical correlation analysis via alternating optimization

Unlike dimensionality reduction (DR) tools for single-view data, e.g., principal component analysis (PCA), canonical correlation analysis (CCA) and generalized CCA (GCCA) can integrate information from multiple feature spaces of the data. This is critical in multi-modal data fusion and analytics, where samples from a single view may not be enough for meaningful DR. In this work, we focus on a popular formulation of GCCA, namely MAX-VAR GCCA. The classic MAX-VAR problem is optimally solvable via eigen-decomposition, but this solution has serious scalability issues. In addition, it has been unclear how to impose regularizers on the sought canonical components, even though structure-promoting regularizers are often desired in practice. We propose an algorithm that exploits data sparsity to handle datasets whose sample and feature dimensions are both large. The algorithm is also flexible in incorporating regularizers on the canonical components. Convergence properties of the proposed algorithm are carefully analyzed, and numerical experiments are presented to showcase its effectiveness.
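To make the eigen-decomposition baseline and its scalability problem concrete, the following is a minimal sketch of the classic MAX-VAR solution, not the algorithm proposed in this work. It assumes the standard formulation, minimizing sum_i ||X_i Q_i - G||_F^2 subject to G^T G = I, in which G is given by the top-k eigenvectors of the sum of the per-view projection matrices. The function names `maxvar_gcca_eig` and `maxvar_objective` are hypothetical, and the views are assumed to be small dense arrays that share the same number of rows (samples).

```python
import numpy as np

def maxvar_gcca_eig(views, k):
    """Classic MAX-VAR GCCA via eigen-decomposition (illustrative sketch).

    views: list of (L x M_i) arrays that share the same L samples.
    Returns the common representation G (L x k) and the per-view
    canonical components Q_i (M_i x k).
    """
    L = views[0].shape[0]
    M = np.zeros((L, L))
    pinvs = []
    for X in views:
        X_pinv = np.linalg.pinv(X)   # (M_i x L) pseudo-inverse
        pinvs.append(X_pinv)
        M += X @ X_pinv              # orthogonal projector onto range(X_i)
    M = (M + M.T) / 2.0              # symmetrize against round-off

    # eigh returns eigenvalues in ascending order, so take the last k columns.
    _, eigvecs = np.linalg.eigh(M)
    G = eigvecs[:, -k:][:, ::-1]

    # With G fixed, each Q_i is an ordinary least-squares solution.
    Qs = [X_pinv @ G for X_pinv in pinvs]
    return G, Qs

def maxvar_objective(views, G, Qs):
    """Sum over views of ||X_i Q_i - G||_F^2."""
    return sum(np.linalg.norm(X @ Q - G, "fro") ** 2
               for X, Q in zip(views, Qs))
```

The sketch forms a dense L x L matrix and eigen-decomposes it, which takes O(L^2) memory and roughly O(L^3) time and discards any sparsity in the views. That cost is the scalability issue the abstract refers to.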
