BM25 With Exponential IDF for Instance Search

This paper introduces an exponential IDF for the BM25 formulation and compares its search accuracy with that of BM25 using the original IDF on a content-based video retrieval (CBVR) task. Our video retrieval method is based on a bag of keypoints (local visual features), and the exponential IDF estimates keypoint importance weights more accurately than the original IDF. The exponential IDF suppresses keypoints from frequently occurring background objects in videos, and we found this effect essential for improving search accuracy in CBVR. The proposed method is designed specifically for instance video search, one of the CBVR tasks, and we demonstrate that it significantly improves instance search accuracy on the TRECVID 2012 video retrieval dataset.
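The role of the IDF term can be illustrated with a minimal sketch of BM25 scoring over bags of quantized keypoints (visual word IDs). The abstract does not state the exponential IDF formula, so `exp_idf_placeholder` below, together with its `alpha` parameter, is a hypothetical stand-in chosen only to show how a steeper IDF pushes down a visual word that occurs in every frame; it should not be read as the paper's definition. The toy frames, the query, and the parameter values `k1=1.2` and `b=0.75` are likewise assumptions.

```python
import math
from collections import Counter


def rsj_idf(n_docs, doc_freq):
    """Classic BM25 (Robertson/Sparck Jones) IDF; negative for very common terms."""
    return math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def exp_idf_placeholder(n_docs, doc_freq, alpha=5.0):
    """HYPOTHETICAL exponential weight, not the paper's formula.

    Decays exponentially with the fraction of documents containing the term.
    """
    return math.exp(-alpha * doc_freq / n_docs)


def bm25_score(query, doc, doc_freqs, n_docs, avg_len, idf=rsj_idf, k1=1.2, b=0.75):
    """BM25 score of one document (a list of visual word IDs) for a query."""
    tf = Counter(doc)
    length_norm = k1 * (1.0 - b + b * len(doc) / avg_len)
    score = 0.0
    for word in set(query):
        f = tf.get(word, 0)
        if f == 0:
            continue
        score += idf(n_docs, doc_freqs[word]) * f * (k1 + 1.0) / (f + length_norm)
    return score


# Toy data (assumed): visual word 0 is "background" and appears in every frame,
# while word 7 belongs to the queried instance and appears in frame 0 only.
frames = [[0, 0, 0, 7], [0, 0, 0, 0], [0, 0, 1, 2], [0, 3, 4, 5]]
query = [0, 7]

n_docs = len(frames)
avg_len = sum(len(f) for f in frames) / n_docs
doc_freqs = Counter(word for f in frames for word in set(f))

scores = [
    bm25_score(query, f, doc_freqs, n_docs, avg_len, idf=exp_idf_placeholder)
    for f in frames
]
best_frame = max(range(n_docs), key=lambda i: scores[i])
print(best_frame, scores)
```

Passing the IDF as a function keeps the rest of the BM25 formula fixed, so the original and an alternative weighting can be compared on the same index by changing only the `idf` argument.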
