Learning query-class dependent weights in automatic video retrieval

Combining retrieval results from multiple modalities plays a crucial role for video retrieval systems, especially for automatic video retrieval systems without any user feedback and query expansion. However, most of current systems only utilize query independent combination or rely on explicit user weighting. In this work, we propose using query-class dependent weights within a hierarchial mixture-of-expert framework to combine multiple retrieval results. We first classify each user query into one of the four predefined categories and then aggregate the retrieval results with query-class associated weights, which can be learned from the development data efficiently and generalized to the unseen queries easily. Our experimental results demonstrate that the performance with query-class dependent weights can considerably surpass that with the query independent weights.

[1]  Richard M. Schwartz,et al.  Nymble: a High-Performance Learning Name-finder , 1997, ANLP.

[2]  Eric Brill,et al.  Some Advances in Transformation-Based Part of Speech Tagging , 1994, AAAI.

[3]  Tobun Dorbin Ng,et al.  Informedia at TRECVID 2003 : Analyzing and Searching Broadcast News Video , 2003, TRECVID.

[4]  Shih-Fu Chang,et al.  MetaSEEk: a content-based metasearch engine for images , 1997, Electronic Imaging.

[5]  Ronald Rosenfeld,et al.  Statistical language modeling using the CMU-cambridge toolkit , 1997, EUROSPEECH.

[6]  Alan F. Smeaton,et al.  Design, implementation and testing of an interactive video retrieval system , 2003, MIR '03.

[7]  Rong Yan,et al.  Co-retrieval: A Boosted Reranking Approach for Video Retrieval , 2004, CIVR.

[8]  Yoram Singer,et al.  Logistic Regression, AdaBoost and Bregman Distances , 2000, Machine Learning.

[9]  Mark T. Maybury,et al.  Broadcast news navigation using story segmentation , 1997, MULTIMEDIA '97.

[10]  Dell Zhang,et al.  Question classification using support vector machines , 2003, SIGIR.

[11]  Stephen E. Robertson,et al.  Okapi at TREC-3 , 1994, TREC.

[12]  Grace Hui Yang,et al.  VideoQA: question answering on news video , 2003, MULTIMEDIA '03.

[13]  Dan Roth,et al.  Learning Question Classifiers , 2002, COLING.

[14]  Yiming Yang,et al.  A Comparative Study on Feature Selection in Text Categorization , 1997, ICML.

[15]  Robert A. Jacobs,et al.  Hierarchical Mixtures of Experts and the EM Algorithm , 1993, Neural Computation.

[16]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[17]  John D. Lafferty,et al.  A Robust Parsing Algorithm for Link Grammars , 1995, IWPT.

[18]  Michael McGill,et al.  Introduction to Modern Information Retrieval , 1983 .

[19]  In-Ho Kang,et al.  Query type classification for web document retrieval , 2003, SIGIR.

[20]  Mitchell P. Marcus,et al.  Text Chunking using Transformation-Based Learning , 1995, VLC@ACL.