Mining partially annotated images

In this paper, we study the problem of mining partially annotated images. We first define what the problem of mining partially annotated images is, and argue that in many real-world applications annotated images are typically partially annotated and thus that the problem of mining partially annotated images exists in many situations. We then propose an effective solution to this problem based on a statistical model we have developed called the Semi-Supervised Correspondence Hierarchical Dirichlet Process (SSCHDP). The main idea of this model lies in exploiting the information pertaining to partially annotated images or even unannotated images to achieve semi-supervised learning under the HDP structure. We apply this model to completing the annotations appropriately for partially annotated images in the training data and then to predicting the annotations appropriately and completely for all the unannotated images either in the training data or in any unseen data beyond the training process. Experiments show that SSC-HDP is superior to the peer models from the recent literature when they are applied to solving the problem of mining partially annotated images.

[1]  Fei-Fei Li,et al.  What, where and who? Classifying events by scene and object recognition , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[2]  Wei-Ying Ma,et al.  A probabilistic semantic model for image annotation and multi-modal image retrieval , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[3]  Michael I. Jordan,et al.  Hierarchical Dirichlet Processes , 2006 .

[4]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[5]  Vasant G Honavar,et al.  Annotating images and image objects using a hierarchical dirichlet process model , 2008, MDM '08.

[6]  Lancelot F. James,et al.  Gibbs Sampling Methods for Stick-Breaking Priors , 2001 .

[7]  Maosong Sun,et al.  Semi-supervised Learning for Image Annotation Based on Conditional Random Fields , 2006, CIVR.

[8]  Dan Klein,et al.  The Infinite PCFG Using Hierarchical Dirichlet Processes , 2007, EMNLP.

[9]  Jiebo Luo,et al.  Learning multi-label scene classification , 2004, Pattern Recognit..

[10]  David A. Forsyth,et al.  Matching Words and Pictures , 2003, J. Mach. Learn. Res..

[11]  Ali Farhadi,et al.  Unlabeled data improvesword prediction , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[12]  Zhi-Hua Zhou,et al.  Multi-Label Learning with Weak Label , 2010, AAAI.

[13]  Philip S. Yu,et al.  Partially Supervised Classification of Text Documents , 2002, ICML.

[14]  Michael I. Jordan,et al.  Modeling annotated data , 2003, SIGIR.

[15]  Christos Faloutsos,et al.  Enhanced max margin learning on multimodal data mining in a multimedia database , 2007, KDD '07.

[16]  R. Manmatha,et al.  Automatic image annotation and retrieval using cross-media relevance models , 2003, SIGIR.

[17]  Wei-Ying Ma,et al.  A Probabilistic Semantic Model for Image Annotation and Multi-Modal Image Retrieva , 2005, ICCV.

[18]  David G. Luenberger,et al.  Linear and nonlinear programming , 1984 .

[19]  R. Manmatha,et al.  Multiple Bernoulli relevance models for image and video annotation , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[20]  Zhi-Hua Zhou,et al.  Multi-Instance Multi-Label Learning with Application to Scene Classification , 2006, NIPS.

[21]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[22]  R. Manmatha,et al.  A Model for Learning the Semantics of Pictures , 2003, NIPS.

[23]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[24]  Thomas Hofmann,et al.  Multi-Instance Multi-Label Learning with Application to Scene Classification , 2007 .

[25]  J. Sethuraman A CONSTRUCTIVE DEFINITION OF DIRICHLET PRIORS , 1991 .

[26]  Xiaojin Zhu,et al.  --1 CONTENTS , 2006 .

[27]  Ali Farhadi,et al.  Unlabeled Data Improves Word Prediction , 2009 .

[28]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[29]  Jing Liu,et al.  Image annotation using multi-correlation probabilistic matrix factorization , 2010, ACM Multimedia.

[30]  Shuang-Hong Yang,et al.  Dirichlet-Bernoulli Alignment: A Generative Model for Multi-Class Multi-Label Multi-Instance Corpora , 2009, NIPS.

[31]  Gang Hua,et al.  Meta-tag propagation by co-training an ensemble classifier for improving image search relevance , 2008, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[32]  James Ze Wang,et al.  Image retrieval: Ideas, influences, and trends of the new age , 2008, CSUR.