Deep Generalized Canonical Correlation Analysis

We present Deep Generalized Canonical Correlation Analysis (DGCCA), a method for learning nonlinear transformations of arbitrarily many views of data such that the resulting representations are maximally informative of each other. While methods exist for nonlinear two-view representation learning (Deep CCA; Andrew et al., 2013) and for linear many-view representation learning (Generalized CCA; Horst, 1961), DGCCA is the first CCA-style multiview representation learning technique that combines the flexibility of nonlinear (deep) representation learning with the statistical power of incorporating information from many independent sources, or views. We present the DGCCA formulation along with an efficient stochastic optimization algorithm for solving it. We learn DGCCA representations on two distinct datasets for three downstream tasks: phonetic transcription from acoustic and articulatory measurements, and hashtag and friend recommendation for Twitter users. We find that DGCCA representations soundly beat existing methods at phonetic transcription and hashtag recommendation, and in general perform no worse than standard linear many-view techniques.
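As background for the linear many-view case the abstract refers to (Generalized CCA; Horst, 1961), the MAXVAR formulation can be sketched in a few lines of NumPy: a shared representation G is recovered from the top eigenvectors of the sum of per-view projection matrices, and each view is then mapped onto G. This is a hedged illustration, not the paper's implementation; the function name `gcca` and the regularization constant `eps` are illustrative choices.

```python
import numpy as np

def gcca(views, k, eps=1e-8):
    """Linear GCCA (MAXVAR/Horst formulation), illustrative sketch.

    views: list of (n_samples, d_j) arrays, assumed centered.
    k: dimensionality of the shared representation.
    Returns the shared representation G (n_samples, k) and the
    per-view linear maps U_j projecting each view onto G.
    """
    n = views[0].shape[0]
    # Sum of per-view projection matrices P_j = X_j (X_j^T X_j)^{-1} X_j^T,
    # with a small ridge term for numerical stability.
    M = np.zeros((n, n))
    for X in views:
        C = X.T @ X + eps * np.eye(X.shape[1])
        M += X @ np.linalg.solve(C, X.T)
    # Shared representation: top-k eigenvectors of the symmetric matrix M.
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    G = eigvecs[:, ::-1][:, :k]
    # Least-squares map from each view onto the shared representation.
    Us = [np.linalg.solve(X.T @ X + eps * np.eye(X.shape[1]), X.T @ G)
          for X in views]
    return G, Us

# Toy usage: three random views of the same 100 samples.
rng = np.random.default_rng(0)
views = [rng.standard_normal((100, d)) for d in (5, 8, 3)]
G, Us = gcca(views, k=2)
```

DGCCA replaces the fixed views X_j in this objective with the outputs of per-view deep networks, optimized jointly with G.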

[1] Jeff A. Bilmes, et al. Unsupervised learning of acoustic features via deep canonical correlation analysis, 2015, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[2] Nathan Srebro, et al. Stochastic optimization for deep CCA via nonlinear orthogonal iterations, 2015, 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[3] David W. Jacobs, et al. Generalized Multiview Analysis: A discriminative latent space, 2012, IEEE Conference on Computer Vision and Pattern Recognition.

[4] Jürgen Schmidhuber, et al. Multimodal Similarity-Preserving Hashing, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5] Anima Anandkumar, et al. Tensor decompositions for learning latent variable models, 2012, J. Mach. Learn. Res.

[6] Mark Dredze, et al. Learning Multiview Embeddings of Twitter Users, 2016, ACL.

[8] H. Hotelling. Relations Between Two Sets of Variates, 1936.

[9] Balaraman Ravindran, et al. Bridge Correlational Neural Networks for Multilingual Multimodal Representation Learning, 2015, NAACL.

[10] John Shawe-Taylor, et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods, 2004, Neural Computation.

[11] Jeff A. Bilmes, et al. On Deep Multi-View Representation Learning, 2015, ICML.

[12] Kaare Brandt Petersen, et al. The Matrix Cookbook, 2006.

[13] P. Horst. Generalized canonical correlations and their applications to experimental data, 1961, Journal of Clinical Psychology.

[14] Xiaowen Dong, et al. Multi-View Signal Processing and Learning on Graphs, 2014.

[15] Sham M. Kakade, et al. An Information Theoretic Framework for Multi-view Learning, 2008, COLT.

[16] J. R. Kettenring. Canonical Analysis of Several Sets of Variables, 1971, Biometrika.

[17] Hugo Larochelle, et al. Correlational Neural Networks, 2015, Neural Computation.

[18] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[19] Hal Daumé, et al. Co-regularized Multi-view Spectral Clustering, 2011, NIPS.

[20] Peter E. Hart, et al. Nearest neighbor pattern classification, 1967, IEEE Trans. Inf. Theory.

[21] Raman Arora, et al. Multi-view learning with supervision for transformed bottleneck features, 2014, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[22] Raymond D. Kent, et al. X-ray microbeam speech production database, 1990.

[23] Jeff A. Bilmes, et al. Deep Canonical Correlation Analysis, 2013, ICML.