Transductive Multi-Instance Multi-Label learning algorithm with application to automatic image annotation

Automatic image annotation has emerged as an important research topic because of its potential applications in both image understanding and web image search. Because image-label mappings are inherently ambiguous and training examples are scarce, developing robust annotation models that perform well remains a challenge. From a machine learning perspective, the annotation task fits both the multi-instance and the multi-label learning frameworks: an image is usually described by multiple semantic labels (keywords), and these labels are often tied to particular regions rather than to the entire image. In this paper, we propose an improved Transductive Multi-Instance Multi-Label (TMIML) learning framework, which aims to take full advantage of both labeled and unlabeled data to address the annotation problem. Experiments on the well-known Corel 5000 data set demonstrate that the proposed method benefits the image annotation task and outperforms most existing image annotation algorithms.
