Using score distributions for query-time fusion in multimedia retrieval

In this paper we present the results of our work on the analysis of multi-modal data for video Information Retrieval, where we exploit the properties of this data for the automatic, query-time generation of weights for multi-modal data fusion. Through empirical testing we have observed that, for a given topic, a high-performing feature, that is, one which achieves high relevance, will have a different distribution of document scores from features that do not perform as well. These observations form the basis for our initial fusion model, which generates weights based on these properties without the need for prior training. Our model can be used not only to combine feature data, but also to combine and weight the results of multiple example query images. Our analysis and experiments were conducted on the TRECVid 2004 and 2005 collections, making use of multiple MPEG-7 low-level features and automatic speech recognition (ASR) transcripts. Our model achieves performance on a par with that of 'oracle'-determined weights, which demonstrates its applicability and advances the case for further investigation of score distributions.
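The abstract does not specify how the weights are computed from the score distributions, so the following is only a minimal sketch of the general idea of training-free, query-time weighting. It assumes a hypothetical heuristic (the gap between the mean of the top-ranked normalised scores and the mean of the remainder) as a stand-in for the distribution properties; this heuristic, the function names, and the `top_k` parameter are illustrative assumptions and not the model described in the paper. The weighted sum used for fusion is a standard weighted CombSUM.

```python
from typing import Dict, Tuple

Scores = Dict[str, float]  # document id -> raw score for one feature


def min_max_normalise(scores: Scores) -> Scores:
    """Rescale one feature's scores to [0, 1] so features are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def distribution_weight(norm_scores: Scores, top_k: int = 1) -> float:
    """ASSUMPTION: hypothetical heuristic, not the paper's model.

    Returns the gap between the mean of the top_k normalised scores and the
    mean of the remaining scores; a larger gap means a more peaked ranking.
    """
    ranked = sorted(norm_scores.values(), reverse=True)
    head, tail = ranked[:top_k], ranked[top_k:]
    if not head or not tail:
        return 0.0
    return sum(head) / len(head) - sum(tail) / len(tail)


def fuse(feature_scores: Dict[str, Scores],
         top_k: int = 1) -> Tuple[Dict[str, float], Scores]:
    """Derive per-feature weights at query time, then apply weighted CombSUM."""
    normalised = {f: min_max_normalise(s) for f, s in feature_scores.items()}
    raw = {f: distribution_weight(n, top_k) for f, n in normalised.items()}
    total = sum(raw.values())
    if total > 0:
        weights = {f: w / total for f, w in raw.items()}
    else:
        # Fall back to uniform weights when no feature is informative.
        weights = {f: 1.0 / len(raw) for f in raw}
    fused: Scores = {}
    for feature, scores in normalised.items():
        for doc, s in scores.items():
            fused[doc] = fused.get(doc, 0.0) + weights[feature] * s
    return weights, fused


if __name__ == "__main__":
    # Toy scores for two hypothetical features over four documents.
    example = {
        "colour": {"a": 0.9, "b": 0.2, "c": 0.1, "d": 0.1},
        "edge": {"a": 0.50, "b": 0.48, "c": 0.46, "d": 0.44},
    }
    weights, fused = fuse(example)
    print(weights)
    print(sorted(fused, key=fused.get, reverse=True))
```

The same `fuse` function could be applied to the result lists of multiple example query images by treating each image's scores as one entry in `feature_scores`, which mirrors the second use the abstract mentions.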
