Bridging the Gap: Query by Semantic Example

A combination of query-by-visual-example (QBVE) and semantic retrieval (SR), denoted query-by-semantic-example (QBSE), is proposed. Images are labeled with respect to a vocabulary of visual concepts, as is usual in SR. Each image is then represented by a vector of posterior concept probabilities, referred to as a semantic multinomial. Retrieval follows the query-by-example paradigm: the user provides a query image, for which a semantic multinomial is 1) computed and 2) matched to those in the database. QBSE is shown to have two main properties of interest, one mostly practical and the other philosophical. From a practical standpoint, because it inherits the generalization ability of SR inside the space of known visual concepts (referred to as the semantic space) but performs much better outside of it, QBSE produces retrieval systems that are more accurate than was previously possible. Philosophically, because it allows a direct comparison of visual and semantic representations under a common query paradigm, QBSE enables the design of experiments that explicitly test the value of semantic representations for image retrieval. An implementation of QBSE under the minimum probability of error (MPE) retrieval framework, previously applied with success to both QBVE and SR, is proposed and used to demonstrate the two properties. In particular, an extensive objective comparison of QBSE with QBVE is presented, showing that the former significantly outperforms the latter both inside and outside the semantic space. By carefully controlling the structure of the semantic space, it is also shown that this improvement can be attributed only to the semantic nature of the representation on which QBSE is based.
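The two-step retrieval described above (compute a semantic multinomial for the query, then match it against those in the database) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the implementation proposed in the paper: the three-concept vocabulary, the per-concept scores, the helper names, and in particular the use of Kullback-Leibler divergence as the matching function, which the abstract does not specify.

```python
import math

def semantic_multinomial(concept_scores, eps=1e-6):
    """Turn non-negative per-concept scores into a probability vector.

    The scores stand in for posterior concept probabilities; a small
    eps is added so no entry is exactly zero (assumed smoothing).
    """
    smoothed = [s + eps for s in concept_scores]
    total = sum(smoothed)
    return [s / total for s in smoothed]

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) between two multinomials."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def rank_database(query_smn, database):
    """Return database image names sorted from best to worst match."""
    scored = [(kl_divergence(query_smn, smn), name)
              for name, smn in database.items()]
    return [name for _, name in sorted(scored)]

# Hypothetical vocabulary: ["sky", "water", "building"]
database = {
    "beach": semantic_multinomial([0.50, 0.45, 0.05]),
    "city": semantic_multinomial([0.20, 0.05, 0.75]),
    "lake": semantic_multinomial([0.30, 0.65, 0.05]),
}

# Step 1: semantic multinomial of the query image (scores are made up).
query = semantic_multinomial([0.45, 0.50, 0.05])

# Step 2: match the query against every database multinomial.
ranking = rank_database(query, database)
print(ranking)
```

In this toy setup the query is compared in the semantic space only; no low-level visual features enter the matching step, which is what distinguishes the approach from QBVE.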
