Bayesian video search reranking

Content-based video search reranking can be regarded as a process that uses visual content to recover the "true" ranking list from the noisy one generated from textual information. This paper explicitly formulates the problem in a Bayesian framework, i.e., maximizing the ranking score consistency among visually similar video shots while minimizing the ranking distance, which represents the disagreement between the objective ranking list and the initial text-based one. Unlike existing point-wise ranking distance measures, which compute the distance in terms of individual scores, two new methods are proposed in this paper to measure the ranking distance based on the disagreement in pair-wise orders. Specifically, the hinge distance penalizes pairs whose order is reversed, in proportion to the degree of the reversal, while the preference strength distance further considers the preference degree. By incorporating the proposed distances into the optimization objective, two reranking methods are developed, which are solved using quadratic programming and matrix computation, respectively. Evaluation on the TRECVID video search benchmark shows performance improvements of up to 21% on TRECVID 2006 and 61.11% on TRECVID 2007 relative to the text search baseline.
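
To make the contrast between the two pair-wise distances concrete, here is a minimal Python sketch. The abstract does not state the formulas, so the squared forms below are assumptions chosen for illustration, not the paper's definitions. The function names (`ordered_pairs`, `hinge_distance`, `preference_strength_distance`) and the toy scores are hypothetical.

```python
from itertools import combinations


def ordered_pairs(initial):
    """Index pairs (i, j) such that the initial list scores i above j."""
    pairs = []
    for i, j in combinations(range(len(initial)), 2):
        if initial[i] > initial[j]:
            pairs.append((i, j))
        elif initial[j] > initial[i]:
            pairs.append((j, i))
    return pairs  # ties in the initial list carry no preference


def hinge_distance(reranked, initial):
    """Assumed form: squared size of each order reversal, zero otherwise."""
    total = 0.0
    for i, j in ordered_pairs(initial):
        reverse = reranked[j] - reranked[i]  # > 0 only if the pair flipped
        if reverse > 0:
            total += reverse ** 2
    return total


def preference_strength_distance(reranked, initial):
    """Assumed form: squared relative change of each pair's score gap."""
    total = 0.0
    for i, j in ordered_pairs(initial):
        ratio = (reranked[i] - reranked[j]) / (initial[i] - initial[j])
        total += (1.0 - ratio) ** 2
    return total


if __name__ == "__main__":
    initial = [3.0, 2.0, 1.0]    # toy text-based scores
    reversed_ = [1.0, 2.0, 3.0]  # every pair flipped
    stretched = [6.0, 4.0, 2.0]  # same order, doubled gaps
    for name, r in [("reversed", reversed_), ("stretched", stretched)]:
        print(name, hinge_distance(r, initial),
              preference_strength_distance(r, initial))
```

Under these assumed forms, a reranked list that keeps every pair in its original order has zero hinge distance, whereas the preference strength distance is still nonzero whenever the score gaps change, which is the sense in which it "further considers the preference degree".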
