Video Retrieval Based on Semantic Concepts

An approach using many intermediate semantic concepts is proposed with the potential to bridge the semantic gap between what color, shape, and texture-based "low-level" image analysis can extract from video and what users really want to find, most likely using text descriptions of their information needs. Semantic concepts such as cars, planes, roads, people, animals, and different types of scenes (outdoor, night time, etc.) can be automatically detected in video with reasonable accuracy. This leads us to ask how these concepts can be used automatically, and how a user (or a retrieval system) translates the user's information need into a selection of related concepts, drawn from the large list of available concepts, that would help find the relevant video clips. We illustrate how semantic concept retrieval can be automatically exploited by mapping queries into query classes and through pseudo-relevance feedback. We also provide evidence of how semantic concepts can be utilized by users in interactive retrieval, through interfaces that provide affordances for explicit concept selection and search, concept filtering, and relevance feedback. How many concepts we actually need, and how accurately they need to be detected and linked through various relationships, is specified in the ontology structure.
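The two automatic strategies named above, mapping a query into a query class and pseudo-relevance feedback, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the query classes, the keyword rule in classify_query, the weights in CLASS_CONCEPT_WEIGHTS, the shot records, and the parameters k and alpha are all hypothetical. A real system would presumably learn the class weights from training queries and use detector outputs over a much larger concept lexicon.

```python
from typing import Dict, List

Shot = Dict[str, object]  # {"id": str, "text": float, "concepts": {name: score}}

# Hypothetical query classes with hand-set concept weights (illustrative only).
CLASS_CONCEPT_WEIGHTS: Dict[str, Dict[str, float]] = {
    "sports": {"outdoor": 0.4, "crowd": 0.5},
    "person": {"face": 0.6, "studio": 0.2},
    "general": {},
}


def classify_query(query: str) -> str:
    """Toy keyword rule standing in for a learned query classifier."""
    words = set(query.lower().split())
    if words & {"soccer", "basketball", "tennis"}:
        return "sports"
    if words & {"person", "president", "anchor"}:
        return "person"
    return "general"


def score(shot: Shot, weights: Dict[str, float]) -> float:
    """Text retrieval score plus a weighted sum of concept detector scores."""
    concepts = shot["concepts"]
    return shot["text"] + sum(w * concepts.get(c, 0.0) for c, w in weights.items())


def rank(shots: List[Shot], weights: Dict[str, float]) -> List[Shot]:
    """Order shots from highest to lowest combined score."""
    return sorted(shots, key=lambda s: score(s, weights), reverse=True)


def pseudo_relevance_feedback(
    shots: List[Shot], weights: Dict[str, float], k: int = 2, alpha: float = 0.5
) -> List[Shot]:
    """Treat the top-k shots as if they were relevant, raise the weight of
    concepts that score highly in them, then re-rank all shots."""
    top = rank(shots, weights)[:k]
    updated = dict(weights)
    for concept in {c for s in top for c in s["concepts"]}:
        mean_top = sum(s["concepts"].get(concept, 0.0) for s in top) / len(top)
        updated[concept] = updated.get(concept, 0.0) + alpha * mean_top
    return rank(shots, updated)


# Hypothetical shots with made-up text and concept detector scores.
SHOTS: List[Shot] = [
    {"id": "A", "text": 0.9, "concepts": {"outdoor": 0.8, "crowd": 0.9}},
    {"id": "B", "text": 0.1, "concepts": {"outdoor": 0.9, "crowd": 0.8}},
    {"id": "C", "text": 0.5, "concepts": {"studio": 0.9, "face": 0.8}},
]

if __name__ == "__main__":
    weights = CLASS_CONCEPT_WEIGHTS[classify_query("soccer goal being scored")]
    print([s["id"] for s in rank(SHOTS, weights)])
    print([s["id"] for s in pseudo_relevance_feedback(SHOTS, weights)])
```

In this sketch the query class only selects which concept weights are combined with the text score, and the feedback step re-estimates those weights from the top-ranked shots without any user input. Both are simplifications of the ideas named in the abstract, not a reproduction of the authors' models.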
