A Multi-View Fusion Method via Tensor Learning and Gradient Descent for Image Features

In many computer vision applications, a single image can be represented by multiple heterogeneous features from different views, most of which lie in high-dimensional spaces. These features reflect different characteristics of the same object and contain compatible and complementary information. How to construct a unified low-dimensional embedding that preserves the useful information in multi-view features remains an important open problem. We therefore propose a multi-view fusion method via tensor learning and gradient descent (MvF-TG). MvF-TG reconstructs a low-dimensional mapping subspace for each object from its k nearest neighbors, which preserves the underlying neighborhood structure of the original local manifold. The method exploits the spatial correlation information among the multi-view features through tensor learning, and it constructs a gradient descent optimization model to generate a better unified low-dimensional embedding. We compare the proposed method with several single-view and multi-view dimension reduction methods in terms of precision (P), recall (R), mean average precision (MAP), and F-measure. In the retrieval experiments, the P values of the new method are 86.80%, 52.00%, 68.56%, and 78.80% on the Corel1k, Corel5k, Corel10k, and Holidays datasets, respectively. In the classification experiments, its mean accuracies are 47.94% and 87.58% on the Caltech101 and Coil datasets, respectively. These values are higher than those obtained by the comparison methods. Evaluations on image classification and retrieval demonstrate the effectiveness of the proposed method for multi-view feature fusion and dimension reduction.
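
For readers unfamiliar with the retrieval indicators named above, the following is a minimal sketch of how P, R, F-measure, and MAP are commonly computed for ranked retrieval results. It assumes the standard textbook definitions, which the abstract does not spell out. The function names are hypothetical, and normalizing average precision by the number of relevant items is one common convention, not necessarily the one used in the experiments.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F-measure) for one query.

    `retrieved` is the list of returned item ids (assumed free of duplicates);
    `relevant` is the collection of ground-truth relevant ids.
    """
    retrieved = list(retrieved)
    relevant = set(relevant)
    hits = sum(1 for item in retrieved if item in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def average_precision(ranked, relevant):
    """Average precision of one ranked list.

    Assumption: the sum of precision-at-hit values is divided by the total
    number of relevant items, so relevant items never retrieved count as zero.
    """
    relevant = set(relevant)
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(queries):
    """Mean of average precision over an iterable of (ranked, relevant) pairs."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0
```

In this sketch, a query's precision is the fraction of returned images that are relevant, while MAP additionally rewards rankings that place relevant images near the top.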
