Sparse additive discriminant canonical correlation analysis for multiple features fusion

Abstract: Canonical correlation analysis (CCA) is an unsupervised representation learning technique that correlates multi-view data by learning a set of projection matrices. Complementary to CCA, many discriminant methods have been proposed to extract discriminative features from multi-view data by introducing supervised class information. However, the projection matrices learned by these methods are mathematically constrained to have rank equal to the number of classes, and thus cannot represent the original data comprehensively. In this paper, we propose a general multi-view information fusion technique, named sparse additive discriminative canonical correlation analysis (SaDCCA). On the one hand, SaDCCA achieves strong discrimination by defining a new affinity matrix that reflects the high-order intra-class characteristics and the inter-class separability. On the other hand, SaDCCA exploits the correlation among multi-view data by maintaining the spirit of CCA. The discrimination among classes and the correlation among views are integrated in an additive manner. To obtain sparse solutions, we first establish the relationship between the objective function and an underdetermined system of linear equations, and then obtain the ℓ1-norm solution by accelerated Bregman iteration in matrix form. SaDCCA imposes no rank constraint on the projection matrices and is capable of providing accurate recognition performance. Experiments conducted on several publicly available datasets demonstrate the effectiveness of the proposed approach.
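To make the sparse-solution step more concrete, the following is a minimal sketch of how a Bregman-type iteration can pick a sparse solution of an underdetermined linear system. It is not the accelerated matrix-form algorithm of SaDCCA: it is the plain, vector-form linearized Bregman iteration, applied to a made-up 2 x 3 system. The names `shrink`, `linearized_bregman`, `mu`, and `delta`, as well as the toy data, are assumptions chosen for illustration only.

```python
def shrink(v, mu):
    """Soft-thresholding: pull each entry toward zero by mu, clipping at zero."""
    return [max(abs(t) - mu, 0.0) * (1.0 if t > 0 else -1.0) for t in v]


def linearized_bregman(A, b, mu, delta, iters=200):
    """Plain (non-accelerated) linearized Bregman iteration for A x = b.

    A is a list of rows (m x n with m < n); b has length m.
    Assumption: 0 < delta < 1 / ||A A^T||_2, so that the iteration is stable.
    """
    m, n = len(A), len(A[0])
    v = [0.0] * n  # accumulated back-projected residuals
    x = [0.0] * n  # current sparse estimate
    for _ in range(iters):
        # residual r = b - A x
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        # v <- v + A^T r
        v = [v[j] + sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # x <- delta * shrink(v, mu)
        x = [delta * s for s in shrink(v, mu)]
    return x


# Toy underdetermined system (illustrative data, not from the paper):
# two equations, three unknowns. The largest eigenvalue of A A^T is 3,
# so delta = 0.3 satisfies the step-size assumption above.
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
b = [1.0, 1.0]

x = linearized_bregman(A, b, mu=5.0, delta=0.3)
residual = max(abs(b[i] - sum(A[i][j] * x[j] for j in range(3))) for i in range(2))
print(x, residual)
```

For this toy system the iterate approaches (0, 0, 1), which is the minimum ℓ1-norm solution, rather than the minimum ℓ2-norm solution (1/3, 1/3, 2/3). With a smaller `mu` the iteration would still satisfy the equations in the limit, but the solution would generally be less sparse.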
