Graph-Based Pairwise Learning to Rank for Video Search

Learning-based ranking, which aims to build a ranking model automatically from training samples using machine learning techniques, is a promising approach to a variety of search tasks. However, training samples labeled with relevancy degrees or ranking orders are often scarce. To address this problem, we propose a novel graph-based learning-to-rank method (GLRank) for video search that leverages the vast amount of unlabeled samples. A relation graph is constructed using sample (i.e., video shot) pairs, rather than individual samples, as vertices. Each vertex in this graph represents the "relevancy relation" between the two samples in a pair (i.e., which sample is more relevant to the given query). This relevancy relation is discovered through a set of pre-trained concept detectors and then propagated among the pairs. Once all the pairs constructed from the samples to be searched have received the propagated relevancy relation, a round-robin criterion is applied to obtain the final ranking list. We have conducted comprehensive experiments on the automatic video search task over the TRECVID 2005-2007 benchmarks, which show significant and consistent improvements over other state-of-the-art ranking approaches.
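
The pipeline described above (pairs as graph vertices, propagation of the relevancy relation, then a round-robin vote) can be illustrated with a minimal sketch. The abstract does not give the exact propagation rule or round-robin criterion, so everything here is an assumption made for illustration: the function names `propagate` and `round_robin_rank`, the clamped neighbor-averaging update, the edge weights, and the "count pairwise wins" ranking are hypothetical stand-ins, not the GLRank formulation.

```python
from itertools import combinations


def propagate(edges, labeled, unlabeled, iterations=20):
    """Spread pair scores over a pair graph (assumed update rule).

    edges:     dict mapping a pair vertex to {neighbor pair: weight}
    labeled:   dict mapping a pair (a, b) to a score; > 0 means a is more relevant
    unlabeled: list of pair vertices whose score is unknown
    Labeled vertices stay fixed; each unlabeled vertex repeatedly takes the
    weighted average of its neighbors' scores.
    """
    scores = dict(labeled)
    scores.update({v: 0.0 for v in unlabeled})
    for _ in range(iterations):
        for v in unlabeled:
            neighbors = edges.get(v, {})
            total = sum(neighbors.values())
            if total > 0:
                weighted = sum(w * scores[u] for u, w in neighbors.items())
                scores[v] = weighted / total
    return scores


def round_robin_rank(samples, pair_scores):
    """Rank samples by the number of pairwise comparisons they win (assumed criterion)."""
    wins = {s: 0 for s in samples}
    for a, b in combinations(samples, 2):
        score = pair_scores.get((a, b))
        if score is None:
            # The pair may be stored in the opposite order; flip the sign.
            score = -pair_scores.get((b, a), 0.0)
        if score > 0:
            wins[a] += 1
        elif score < 0:
            wins[b] += 1
    # sorted() is stable, so ties keep their input order.
    return sorted(samples, key=lambda s: -wins[s])


# Toy data: two pairs with known relations, one pair to be inferred.
labeled = {("shot_a", "shot_b"): 0.7, ("shot_b", "shot_c"): 0.4}
unlabeled = [("shot_a", "shot_c")]
edges = {
    ("shot_a", "shot_c"): {("shot_a", "shot_b"): 0.5, ("shot_b", "shot_c"): 0.5},
}

pair_scores = propagate(edges, labeled, unlabeled)
ranking = round_robin_rank(["shot_c", "shot_a", "shot_b"], pair_scores)
```

In this toy setup the unlabeled pair inherits its score from the two labeled pairs it is connected to, and the final list is obtained purely from pairwise outcomes, which mirrors the role the abstract assigns to the round-robin step.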
