Discriminative Feature Extraction by a Neural Implementation of Canonical Correlation Analysis

Canonical correlation analysis (CCA) measures linear relationships between two sets of variables (views), and the resulting correlated projections can be used for feature extraction in classification problems with multiview data. However, the correlated features extracted by CCA may not be class discriminative, since CCA does not use class labels in its traditional formulation. A method called discriminative CCA (DCCA), inspired by linear discriminant analysis (LDA), aims to increase the discriminative ability of CCA, but it has been shown that the features it extracts are identical to those of LDA up to an orthogonal transformation. DCCA is therefore equivalent to applying single-view (regular) LDA to each view separately. In addition, DCCA and similar approaches generalize poorly because they are computed from sample covariance matrices, which are sensitive to outliers and noisy samples. In this paper, we propose a method, called discriminative alternating regression (D-AR), to find features that are both correlated and discriminative. D-AR uses two alternating multilayer perceptrons, each with a linear hidden layer, that learn to predict both the class labels and each other's outputs. We show that the features found by D-AR on training sets achieve significantly higher classification accuracies on the test sets of facial expression recognition, object recognition, and image retrieval experimental data sets.
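
The alternating scheme described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the loss, the optimizer, or what exactly "the outputs of each other" refers to. The sketch assumes a squared-error loss, linear output units, plain gradient descent, and that each network's target is the one-hot class labels concatenated with the other network's hidden-layer features. All names (`LinearHiddenMLP`, `fit_alternating`, `standardize`, `n_hidden`, `inner`, `lr`) are hypothetical.

```python
import numpy as np


def one_hot(labels, n_classes):
    """Encode integer class labels as one-hot rows."""
    out = np.zeros((labels.size, n_classes))
    out[np.arange(labels.size), labels] = 1.0
    return out


def standardize(X):
    """Zero-mean, unit-variance columns (an assumption made for stable steps)."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)


class LinearHiddenMLP:
    """One linear hidden layer: features = X @ W, outputs = features @ V."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.W = rng.normal(scale=0.1, size=(n_in, n_hidden))
        self.V = rng.normal(scale=0.1, size=(n_hidden, n_out))

    def features(self, X):
        return X @ self.W

    def step(self, X, target, lr):
        """One gradient-descent step on squared error; returns the pre-step MSE."""
        H = X @ self.W
        err = H @ self.V - target
        n = X.shape[0]
        grad_V = H.T @ err / n
        grad_W = X.T @ (err @ self.V.T) / n
        self.V -= lr * grad_V
        self.W -= lr * grad_W
        return float(np.mean(err ** 2))


def fit_alternating(X1, X2, labels, n_classes, n_hidden=2,
                    epochs=100, inner=5, lr=0.01, seed=0):
    """Train two networks in turn, each on [labels, other view's features]."""
    rng = np.random.default_rng(seed)
    Y = one_hot(labels, n_classes)
    n_out = n_classes + n_hidden
    net1 = LinearHiddenMLP(X1.shape[1], n_hidden, n_out, rng)
    net2 = LinearHiddenMLP(X2.shape[1], n_hidden, n_out, rng)
    for _ in range(epochs):
        # Hold view 2's features fixed as part of the target for view 1.
        target1 = np.hstack([Y, net2.features(X2)])
        for _ in range(inner):
            net1.step(X1, target1, lr)
        # Swap roles: view 1's updated features become targets for view 2.
        target2 = np.hstack([Y, net1.features(X1)])
        for _ in range(inner):
            net2.step(X2, target2, lr)
    return net1, net2
```

After fitting on standardized views, `net1.features(X1)` and `net2.features(X2)` would serve as the extracted features and could be passed to any classifier. The label term in each target is what keeps the two networks from agreeing on a trivial all-zero representation in this sketch.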
