Video search reranking through random walk over document-level context graph

Multimedia search over distributed sources often results in recurrent images or videos that are manifested beyond the textual modality. To exploit such contextual patterns while keeping the simplicity of keyword-based search, we propose novel reranking methods that leverage the recurrent patterns to improve the initial text search results. The approach, context reranking, is formulated as a random walk over a context graph, where video stories are nodes and the edges between them are weighted by multimodal contextual similarities. The random walk is biased toward stories with higher initial text search scores, a principled way to consider both the initial text search results and their implicit contextual relationships. When evaluated on the TRECVID 2005 video benchmark, the proposed approach improves retrieval by up to 32% on average relative to the baseline text search method in terms of story-level Mean Average Precision. For people-related queries, which usually have recurrent coverage across news sources, the relative improvement is up to 40%. Moreover, the proposed method does not require any additional input from users (e.g., example images) or complex search models for special queries (e.g., named person search).
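
The biased random walk described above can be illustrated with a small sketch. This is not the authors' implementation. It is a generic personalized-PageRank-style iteration, assuming a symmetric story-to-story similarity matrix and a vector of initial text search scores. The function name `context_rerank`, the mixing weight `alpha`, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def context_rerank(similarity, text_scores, alpha=0.8, tol=1e-10, max_iter=1000):
    """Rerank stories with a random walk biased toward high text scores.

    similarity  : (n, n) non-negative contextual similarities between stories.
    text_scores : (n,) non-negative initial text search scores.
    alpha       : assumed weight on following graph edges; (1 - alpha) is the
                  weight on jumping back to the text-score prior.
    """
    S = np.asarray(similarity, dtype=float)
    v = np.asarray(text_scores, dtype=float)
    n = S.shape[0]

    # Row-normalize similarities into transition probabilities.
    # Stories with no neighbors fall back to a uniform transition row.
    row_sums = S.sum(axis=1, keepdims=True)
    P = np.divide(S, row_sums, out=np.full_like(S, 1.0 / n), where=row_sums > 0)

    # The bias (restart) distribution is the normalized text score vector.
    total = v.sum()
    prior = v / total if total > 0 else np.full(n, 1.0 / n)

    # Power iteration until the score vector stops changing.
    x = prior.copy()
    for _ in range(max_iter):
        x_next = alpha * (x @ P) + (1.0 - alpha) * prior
        converged = np.abs(x_next - x).sum() < tol
        x = x_next
        if converged:
            break
    return x


# Toy example: stories 0 and 1 are near-duplicates; story 1 has no text match.
similarity = [[0.0, 0.9, 0.1],
              [0.9, 0.0, 0.1],
              [0.1, 0.1, 0.0]]
text_scores = [1.0, 0.0, 0.5]
scores = context_rerank(similarity, text_scores)
ranking = np.argsort(-scores)  # story indices, best first
```

In this toy setup, story 1 receives a non-zero score purely through its contextual link to story 0, which is the effect the reranking is meant to capture. The value of `alpha` controls how far the result may drift from the initial text ranking and would need tuning on real data.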
