Query-Adaptive Fusion for Multimodal Search

We present a broad survey of query-adaptive search strategies across a variety of application domains, in which the internal retrieval mechanisms used for search are adapted to the anticipated needs of each individual query the system receives. Although these query-adaptive approaches range from meta-search over text collections to multimodal search over video databases, we propose that all such systems can be framed and discussed within a single, unified framework. We pay particular attention to the domain of video search, where search cues are available from a rich set of modalities, including textual speech transcripts, low-level visual features, and high-level semantic concept detectors. The relative efficacy of each modality varies widely across query types. We observe that the state of the art in query-adaptive retrieval frameworks for video collections depends heavily on the definition of query classes, which are groups of queries that share similar optimal search strategies, whereas applications in text and Web retrieval have adopted more advanced strategies, such as direct prediction of search method performance and inclusion of contextual cues from the searcher. We conclude that these advanced strategies, previously developed for text retrieval, have a broad range of possible applications in future research on multimodal video search.
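To make the notion of a query class concrete, the following is a minimal sketch of query-class-dependent fusion, assuming a weighted linear combination of per-modality scores. The class names, weight values, the toy classification rule, and the function names `classify_query` and `fuse` are all hypothetical illustrations and are not taken from any system covered by the survey.

```python
from typing import Dict

# Hypothetical per-class modality weights (illustrative values only).
# Each query class maps to one weight per modality named in the abstract:
# speech-transcript text, low-level visual features, concept detectors.
CLASS_WEIGHTS: Dict[str, Dict[str, float]] = {
    "named_person": {"text": 0.8, "visual": 0.1, "concept": 0.1},
    "general_object": {"text": 0.3, "visual": 0.3, "concept": 0.4},
}


def classify_query(query: str) -> str:
    """Toy rule (an assumption): a capitalized token after the first word
    suggests a proper name, so the query is treated as a named-person query."""
    tokens = query.split()
    if any(tok[:1].isupper() for tok in tokens[1:]):
        return "named_person"
    return "general_object"


def fuse(query: str, scores: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Combine per-modality scores for each shot using the weights of the
    class the query is assigned to. `scores` maps modality -> shot id -> score."""
    weights = CLASS_WEIGHTS[classify_query(query)]
    fused: Dict[str, float] = {}
    for modality, shot_scores in scores.items():
        w = weights.get(modality, 0.0)  # unknown modalities contribute nothing
        for shot, score in shot_scores.items():
            fused[shot] = fused.get(shot, 0.0) + w * score
    return fused


# Example scores: shot1 matches the transcript well, shot2 matches visually.
example_scores = {
    "text": {"shot1": 0.9, "shot2": 0.2},
    "visual": {"shot1": 0.1, "shot2": 0.8},
    "concept": {"shot1": 0.2, "shot2": 0.7},
}
person_ranking = fuse("Find shots of Condoleezza Rice", example_scores)
object_ranking = fuse("Find shots of a boat", example_scores)
```

With these illustrative weights, the same modality scores rank `shot1` first for the named-person query and `shot2` first for the general-object query, which is the behavior a query-class-dependent scheme is meant to capture. Real systems would learn the classes and weights from training queries rather than fix them by hand.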
