Paper Information - Multi-Label Learning With Fused Multimodal Bi-Relational Graph

This work addresses multi-label image classification with multiple feature modalities. Given a collection of images with partial labels, we first model the associations between the different feature modalities and the image labels. These associations are then propagated with a graph diffusion kernel to classify the unlabeled images. To this end, we propose a novel Fused Multimodal Bi-relational Graph representation, which comprises multiple graphs, one per feature modality, and one graph for the image labels. This representation exploits feature complementarity and label correlation jointly, whereas previous work considers these two factors in isolation. Furthermore, we provide a solution for learning the weight of each image graph by estimating the discriminative power of the corresponding feature modality. Experiments with the proposed method on two standard multi-label image datasets show promising results.
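The pipeline outlined in the abstract can be illustrated with a toy sketch. It is not the paper's formulation: the abstract does not specify the diffusion kernel or the weight-learning step, so this sketch assumes a plain random walk with restart as the propagation rule and uses fixed, hand-picked modality weights. All function names, affinity values, and parameters (`fuse_graphs`, `build_bi_relational`, `random_walk_with_restart`, `alpha`, the `color` and `texture` graphs) are hypothetical.

```python
def fuse_graphs(graphs, weights):
    """Weighted average of per-modality image affinity matrices.

    Assumption: weights are given; the paper learns them from the
    discriminative power of each modality, which is not reproduced here.
    """
    n = len(graphs[0])
    total = sum(weights)
    return [[sum(w * g[i][j] for g, w in zip(graphs, weights)) / total
             for j in range(n)] for i in range(n)]


def build_bi_relational(image_graph, label_graph, known):
    """Stack image nodes and label nodes into one affinity matrix.

    known[i][a] is 1 if image i is annotated with label a, else 0.
    Image nodes come first, label nodes after them.
    """
    n, k = len(image_graph), len(label_graph)
    W = [[0.0] * (n + k) for _ in range(n + k)]
    for i in range(n):
        for j in range(n):
            W[i][j] = image_graph[i][j]
    for a in range(k):
        for b in range(k):
            W[n + a][n + b] = label_graph[a][b]
    for i in range(n):
        for a in range(k):
            W[i][n + a] = W[n + a][i] = float(known[i][a])
    return W


def random_walk_with_restart(W, seed, alpha=0.85, iters=200):
    """Iterate p <- alpha * P^T p + (1 - alpha) * e_seed, P row-normalized W."""
    size = len(W)
    row_sums = [sum(row) or 1.0 for row in W]
    p = [0.0] * size
    p[seed] = 1.0
    for _ in range(iters):
        nxt = [0.0] * size
        for i in range(size):
            if p[i] == 0.0:
                continue
            for j in range(size):
                nxt[j] += alpha * p[i] * W[i][j] / row_sums[i]
        nxt[seed] += 1.0 - alpha  # restart at the seed label node
        p = nxt
    return p


# Toy data: images 0 and 1 look alike, as do images 2 and 3.
color = [[0.0, 0.9, 0.1, 0.1], [0.9, 0.0, 0.1, 0.1],
         [0.1, 0.1, 0.0, 0.8], [0.1, 0.1, 0.8, 0.0]]
texture = [[0.0, 0.7, 0.2, 0.1], [0.7, 0.0, 0.1, 0.2],
           [0.2, 0.1, 0.0, 0.9], [0.1, 0.2, 0.9, 0.0]]
label_graph = [[0.0, 0.1], [0.1, 0.0]]    # weak label correlation
known = [[1, 0], [0, 0], [0, 1], [0, 0]]  # images 1 and 3 are unlabeled

fused = fuse_graphs([color, texture], weights=[0.6, 0.4])
W = build_bi_relational(fused, label_graph, known)
n_images = len(fused)

# One walk per label, seeded at that label's node; the stationary mass on
# each image node serves as the image's relevance score for the label.
scores_label0 = random_walk_with_restart(W, seed=n_images + 0)[:n_images]
scores_label1 = random_walk_with_restart(W, seed=n_images + 1)[:n_images]
print("label 0 scores:", scores_label0)
print("label 1 scores:", scores_label1)
```

In this toy setup, unlabeled image 1 should score higher than image 3 for label 0, because it is strongly connected to the labeled image 0 in both modalities. The label-to-label edges let evidence for one label leak to correlated labels, which is the role the label graph plays in a bi-relational representation.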
