Supervised graph regularization based cross media retrieval with intra and inter-class correlation

Abstract With the rapid development of internet technology, accurately mining and retrieving information from the internet has become an urgent problem, and cross media retrieval is an active research topic. This paper proposes a cross media retrieval approach that learns two pairs of projections for different retrieval tasks. First, we learn a common subspace into which heterogeneous media data are projected, so that their similarity can be measured in one isomorphic subspace. Second, we build isomorphic and heterogeneous adjacency graphs to preserve the correlations of the cross media data, and we combine the two processes to learn the common subspace. The intra-class and inter-class similarity of images or texts is also considered in the unified framework. Third, the L2 norm is used to perform feature selection for the different media data. Experimental results on three datasets demonstrate the effectiveness of the proposed approach.
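The graph-based part of the approach can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes a binary, label-based adjacency (two items are linked when they share a class) and a standard smoothness penalty that sums the squared distances between linked items in the common subspace. The function names `label_adjacency` and `graph_penalty`, the toy labels, and the toy embeddings are all hypothetical.

```python
def label_adjacency(labels_a, labels_b):
    """W[i][j] = 1.0 when item i of set A and item j of set B share a class label.

    Assumption: a binary, label-based graph; the paper's actual weights may differ.
    """
    return [[1.0 if a == b else 0.0 for b in labels_b] for a in labels_a]


def graph_penalty(Z_a, Z_b, W):
    """Sum over i, j of W[i][j] * squared Euclidean distance between Z_a[i] and Z_b[j]."""
    total = 0.0
    for i, za in enumerate(Z_a):
        for j, zb in enumerate(Z_b):
            if W[i][j]:
                total += W[i][j] * sum((p - q) ** 2 for p, q in zip(za, zb))
    return total


# Toy data (hypothetical): embeddings already projected into a 2-D common subspace.
image_labels = ["cat", "dog", "cat"]
text_labels = ["cat", "dog"]
image_embed = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
text_embed = [[1.0, 0.0], [0.0, 1.0]]

W_img = label_adjacency(image_labels, image_labels)   # same-modality (isomorphic) graph
W_cross = label_adjacency(image_labels, text_labels)  # cross-modality (heterogeneous) graph

# Penalty when same-class items coincide in the subspace.
aligned = (graph_penalty(image_embed, image_embed, W_img)
           + graph_penalty(image_embed, text_embed, W_cross))

# Penalty when the text embeddings are swapped, so same-class pairs are far apart.
swapped = graph_penalty(image_embed, text_embed[::-1], W_cross)
```

In this toy setting the penalty is lower when items of the same class lie close together across both modalities, which is the kind of correlation a graph regularizer is meant to preserve while the projections are learned.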
