Unsupervised multi-view feature extraction with dynamic graph learning

Abstract: Graph-based multi-view feature extraction has attracted much attention in the literature. However, conventional solutions generally rely on a manually defined affinity graph matrix, which struggles to capture the intrinsic sample relations across multiple views. In addition, graph construction and feature extraction are treated as two independent processes, which may lead to sub-optimal results. Furthermore, the raw data may contain noise that reduces the reliability of the affinity matrix. In this paper, we propose a novel Unsupervised Multi-view Feature Extraction with Dynamic Graph Learning (UMFE-DGL) method to address these limitations. We devise a unified learning framework that performs dynamic graph learning and feature extraction simultaneously. Dynamic graph learning adaptively captures the intrinsic view-specific relations among samples. Feature extraction learns a projection matrix that preserves the dynamically adjusted sample relations, as modelled by the graph, in the low-dimensional features. Experimental results on several public datasets demonstrate the superior performance of the proposed approach compared with state-of-the-art techniques.
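To make the idea of coupling graph learning with feature extraction concrete, the following is a minimal sketch of a generic alternating scheme. It is not the UMFE-DGL objective or its optimizer, neither of which is given in the abstract. As assumptions, it substitutes a well-known closed-form adaptive-neighbour affinity for the graph step, uses the smallest eigenvectors of X^T L X under an orthogonality constraint for the projection step, and simply averages per-view graphs for initialization. All function names and parameters (`adaptive_neighbor_graph`, `update_projection`, `alternate`, `k`, `dim`, `n_iter`) are hypothetical.

```python
import numpy as np

def adaptive_neighbor_graph(Z, k):
    """Row-stochastic k-neighbour affinity built from squared distances in Z (n x d).

    Assumption: closed-form adaptive-neighbour weights, used here as a
    stand-in for the paper's dynamic graph learning step. Requires n > k + 1.
    """
    n = Z.shape[0]
    sq = np.sum(Z ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    np.fill_diagonal(D, np.inf)              # a sample is never its own neighbour
    order = np.argsort(D, axis=1)[:, :k + 1]  # k+1 nearest samples per row
    S = np.zeros((n, n))
    for i in range(n):
        d = D[i, order[i]]                    # ascending distances
        denom = k * d[k] - d[:k].sum()
        if denom <= 1e-12:                    # ties: fall back to uniform weights
            w = np.full(k, 1.0 / k)
        else:
            w = (d[k] - d[:k]) / denom        # non-negative, sums to 1
        S[i, order[i, :k]] = w
    return S

def graph_laplacian(S):
    """Laplacian of the symmetrized affinity matrix."""
    A = (S + S.T) / 2.0
    return np.diag(A.sum(axis=1)) - A

def update_projection(X, S, dim):
    """Orthonormal W minimizing tr(W^T X^T L X W): smallest eigenvectors."""
    M = X.T @ graph_laplacian(S) @ X
    M = (M + M.T) / 2.0                       # guard against round-off asymmetry
    _, vecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    return vecs[:, :dim]

def alternate(views, dim=2, k=5, n_iter=5):
    """Alternate between the projection step and the graph step (n_iter >= 1)."""
    # Assumption: views are centred and concatenated into one feature matrix.
    X = np.hstack([v - v.mean(axis=0) for v in views])
    # Assumption: the initial graph is the plain average of per-view graphs.
    S = np.mean([adaptive_neighbor_graph(v, k) for v in views], axis=0)
    for _ in range(n_iter):
        W = update_projection(X, S, dim)      # feature extraction with graph fixed
        S = adaptive_neighbor_graph(X @ W, k)  # graph re-learned in the new space
    return W, S

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    views = [rng.normal(size=(30, 4)), rng.normal(size=(30, 6))]
    W, S = alternate(views)
    print(W.shape, S.shape)
```

The point of the sketch is the loop in `alternate`: the graph is not fixed in advance but recomputed from the current low-dimensional features, which is the general pattern the abstract contrasts with a manually defined affinity matrix.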
