Exploring multimedia in a keyword space

We address the problem of searching multimedia by semantic similarity in a keyword space. In contrast to previous research we represent multimedia content by a vector of keywords instead of a vector of low-level features. This vector of keywords can be obtained through user manual annotations or computed by an automatic annotation algorithm. In this setting, we studied the influence of two aspects of the search by semantic similarity process: (1) accuracy of user keywords versus automatic keywords and (2) functions to compute semantic similarity between keyword vectors of two multimedia documents. We consider these two aspects to be crucial in the design of a keyword space that can exploit social-media information and can enrich applications such as Flickr and YouTube. Experiments were performed on an image and a video dataset with a large number of keywords, with different similarity functions and with two annotation methods. Surprisingly, we found that multimedia semantic similarity with automatic keywords performs as good as or better than 95% accurate user keywords.

[1]  Nicu Sebe,et al.  Toward Improved Ranking Metrics , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[2]  Rong Yan,et al.  Semantic concept-based query expansion and re-ranking for multimedia retrieval , 2007, ACM Multimedia.

[3]  R. Manmatha,et al.  Multiple Bernoulli relevance models for image and video annotation , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[4]  Alexander Hauptmann,et al.  How many high-level concepts will fill the semantic gap in video retrieval ? , 2007 .

[5]  James Ze Wang,et al.  SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture LIbraries , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[6]  Nuno Vasconcelos,et al.  Bridging the Gap: Query by Semantic Example , 2007, IEEE Transactions on Multimedia.

[7]  Shih-Fu Chang,et al.  Semantic visual templates: linking visual features to semantics , 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[8]  John R. Smith,et al.  Large-scale concept ontology for multimedia , 2006, IEEE MultiMedia.

[9]  Gustavo Carneiro,et al.  Formulating semantic image annotation as a supervised learning problem , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[10]  Stefan M. Rüger,et al.  Automated Image Annotation Using Global Features and Robust Nonparametric Density Estimation , 2005, CIVR.

[11]  Nuno Vasconcelos,et al.  Query by Semantic Example , 2006, CIVR.

[12]  Farshad Fotouhi,et al.  Semantic feedback for interactive image retrieval , 2005, MULTIMEDIA '05.

[13]  Marcel Worring,et al.  The Semantic Pathfinder: Using an Authoring Metaphor for Generic Multimedia Indexing , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Dragutin Petkovic,et al.  Query by Image and Video Content: The QBIC System , 1995, Computer.

[15]  Guy Riddihough Trading Accuracy for Speed , 2006, Science.

[16]  Qiang Yang,et al.  A unified framework for semantics and feature based relevance feedback in image retrieval systems , 2000, ACM Multimedia.

[17]  Wei-Ying Ma,et al.  Learning a semantic space from user's relevance feedback for image retrieval , 2003, IEEE Trans. Circuits Syst. Video Technol..

[18]  Andrew McCallum,et al.  A comparison of event models for naive bayes text classification , 1998, AAAI 1998.

[19]  Stefan M. Rüger,et al.  Fractional Distance Measures for Content-Based Image Retrieval , 2005, ECIR.

[20]  Stefan M. Rüger,et al.  Trading Precision for Speed: Localised Similarity Functions , 2005, CIVR.

[21]  Daniel Gatica-Perez,et al.  Modeling Semantic Aspects for Cross-Media Image Indexing , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Tsuhan Chen,et al.  An active learning framework for content-based information retrieval , 2002, IEEE Trans. Multim..

[23]  Milind R. Naphade,et al.  Semantic Multimedia Retrieval using Lexical Query Expansion and Model-Based Reranking , 2006, 2006 IEEE International Conference on Multimedia and Expo.

[24]  Stefan M. Rüger,et al.  Information-theoretic semantic multimedia indexing , 2007, CIVR '07.

[25]  Thomas S. Huang,et al.  Unifying Keywords and Visual Contents in Image Retrieval , 2002, IEEE Multim..

[26]  Nicu Sebe,et al.  Content-based multimedia information retrieval: State of the art and challenges , 2006, TOMCCAP.

[27]  Jianhua Lin,et al.  Divergence measures based on the Shannon entropy , 1991, IEEE Trans. Inf. Theory.

[28]  David A. Forsyth,et al.  Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary , 2002, ECCV.

[29]  Alan F. Smeaton,et al.  Experiments on using semantic distances between words in image caption retrieval , 1996, SIGIR '96.

[30]  Stefan M. Rüger,et al.  High-dimensional visual vocabularies for image retrieval , 2007, SIGIR.

[31]  John R. Smith,et al.  Cluster-based data modeling for semantic video search , 2007, CIVR '07.