Annotating Images by Mining Image Search

Xin-Jing Wang, Microsoft Research Asia, China
Lei Zhang, Microsoft Research Asia, China
Xirong Li, Microsoft Research Asia, China
Wei-Ying Ma, Microsoft Research Asia, China

DOI: 10.4018/978-1-60960-818-7.ch4.17

Although the computer vision and machine learning communities have studied it for years, image annotation is still far from practical. In this chapter, the authors propose a modeless approach to image annotation that investigates how effective a data-driven method can be: an uncaptioned image is annotated by mining its search results. The authors collected 2.4 million images, together with their surrounding text, from a few photo forum Web sites as the database supporting this data-driven approach. The process has three steps: (1) a search step that discovers visually and semantically similar search results; (2) a mining step that discovers salient terms in the textual descriptions of those search results; and (3) an annotation rejection step that filters out the noisy terms yielded by step 2. To ensure real-time annotation, two key techniques are used: the high-dimensional visual features of images are mapped to hash codes, and the approach is implemented as a distributed system in which the search and mining processes are provided as Web services. Typically, the entire process finishes in less than 1 second. Because no training dataset is required, the proposed approach supports annotation with an unlimited vocabulary and is highly scalable and robust to outliers. Experimental results on real Web images show the effectiveness and efficiency of the proposed algorithm.
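The three steps of search, mining, and rejection can be illustrated with a small sketch. It is not the authors' implementation: the threshold-based hashing, the Hamming-distance search, the document-frequency mining, and every name and parameter below (`hash_code`, `max_distance`, `min_support`, the toy database) are assumptions chosen for illustration, and only visual similarity is modeled in the search step.

```python
from collections import Counter


def hash_code(features, thresholds):
    """Binarize a feature vector: bit i is 1 when features[i] > thresholds[i].
    (Assumed hashing scheme; the abstract does not specify one.)"""
    code = 0
    for value, threshold in zip(features, thresholds):
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")


def search(query_code, database, max_distance):
    """Step 1: texts of database images whose hash codes are near the query."""
    return [text for code, text in database
            if hamming(query_code, code) <= max_distance]


def mine_terms(texts, stopwords):
    """Step 2: count how many search results contain each term."""
    counts = Counter()
    for text in texts:
        counts.update(set(text.lower().split()) - stopwords)
    return counts


def reject(counts, num_results, min_support):
    """Step 3: keep only terms found in at least min_support of the results."""
    if num_results == 0:
        return []
    return sorted(term for term, count in counts.items()
                  if count / num_results >= min_support)


def annotate(features, thresholds, database, stopwords,
             max_distance=1, min_support=0.6):
    texts = search(hash_code(features, thresholds), database, max_distance)
    return reject(mine_terms(texts, stopwords), len(texts), min_support)


# Toy data (hypothetical): 4-dimensional features and short captions.
thresholds = [0.5, 0.5, 0.5, 0.5]
database = [
    (hash_code([0.9, 0.1, 0.8, 0.2], thresholds), "sunset over the beach"),
    (hash_code([0.8, 0.2, 0.9, 0.6], thresholds), "beach sunset in summer"),
    (hash_code([0.1, 0.9, 0.2, 0.7], thresholds), "city street at night"),
]
stopwords = {"the", "in", "over", "at"}

annotations = annotate([0.7, 0.3, 0.9, 0.1], thresholds, database, stopwords)
print(annotations)  # ['beach', 'sunset']
```

Comparing compact integer codes with XOR and a bit count is much cheaper than computing distances between high-dimensional vectors, which is the kind of saving that makes hash codes attractive when annotation must finish in real time.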
