Efficient Tag Mining via Mixture Modeling for Real-Time Search-Based Image Annotation

Although it has been studied extensively for many years, automatic image annotation remains a challenging problem. Recently, data-driven approaches have demonstrated great success in image auto-annotation. Such approaches leverage abundant, partially annotated web images to annotate an uncaptioned image. Specifically, given an uncaptioned image as a query, they first retrieve a group of visually similar images, then mine meaningful phrases from the texts surrounding the image search results. Since the surrounding texts are generally noisy, mining meaningful phrases effectively is crucial to the success of such approaches. We propose a mixture modeling approach which assumes that a tag is generated from a convex combination of topics. Unlike a typical topic modeling approach such as LDA, topics in our approach are explicitly learnt from a definitive catalog of the Web, i.e. the Open Directory Project (ODP). Compared with previous work, our approach has two advantages: first, it uses an open vocabulary rather than a limited one defined by a training set; second, it is efficient enough for real-time annotation. Experiments on two billion web images demonstrate the efficiency and effectiveness of the proposed approach.
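To make the phrase "a tag is generated from a convex combination of topics" concrete, the following is a minimal sketch, not the authors' implementation. It scores a candidate tag w as p(w) = sum over topics k of theta_k * p(w | k), where the weights theta_k are non-negative and sum to one. The topic names, the word probabilities, and the function names `tag_probability` and `rank_tags` are all hypothetical, chosen only for illustration; in the approach described above the topics would be learnt from ODP categories rather than written by hand.

```python
from typing import Dict, List, Tuple

# Hypothetical topic-word distributions p(word | topic).
# These toy numbers are assumptions standing in for topics learnt from ODP.
TOPICS: Dict[str, Dict[str, float]] = {
    "animals": {"tiger": 0.5, "cat": 0.3, "grass": 0.2},
    "sports": {"tiger": 0.1, "golf": 0.6, "grass": 0.3},
}


def tag_probability(
    tag: str,
    weights: Dict[str, float],
    topics: Dict[str, Dict[str, float]] = TOPICS,
) -> float:
    """Return p(tag) under a convex combination of topic distributions."""
    # A convex combination needs non-negative weights that sum to one.
    total = sum(weights.values())
    if any(w < 0 for w in weights.values()) or abs(total - 1.0) > 1e-9:
        raise ValueError("topic weights must be non-negative and sum to 1")
    # Words absent from a topic contribute zero probability for that topic.
    return sum(w * topics[name].get(tag, 0.0) for name, w in weights.items())


def rank_tags(
    candidates: List[str],
    weights: Dict[str, float],
    topics: Dict[str, Dict[str, float]] = TOPICS,
) -> List[Tuple[str, float]]:
    """Sort candidate tags by mixture probability, highest first."""
    scored = [(tag, tag_probability(tag, weights, topics)) for tag in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Assumed topic weights for one query image (hypothetical values).
    theta = {"animals": 0.75, "sports": 0.25}
    for tag, score in rank_tags(["tiger", "golf", "click"], theta):
        print(f"{tag}: {score:.3f}")
```

In this toy setting a noisy phrase such as "click", which no topic supports, receives zero probability and falls to the bottom of the ranking, which is the intuition behind using topics to filter noisy surrounding text.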
