Content-Aware Ranking for Visual Search

The ranking models of existing image and video search engines generally rely on the text associated with each item, while the visual content itself is neglected. Imperfect search results frequently appear because the textual features do not match the actual visual content. Visual reranking, in which visual information is applied to refine text-based search results, has proven effective. However, the improvement it brings is limited, mainly because errors in the text-based results propagate to the refinement stage. In this paper, we propose a Content-Aware Ranking model based on the “learning to rank” framework, in which textual and visual information are leveraged simultaneously in the ranking learning process. We formulate Content-Aware Ranking as large margin structured output learning, modeling the visual information as a regularization term. Direct optimization of the learning problem is nearly infeasible because the number of constraints is huge, so we adopt an efficient cutting plane algorithm that learns the model by iteratively adding the most violated constraints. Extensive experiments on a large-scale dataset collected from a commercial Web image search engine, as well as on the TRECVID 2007 video search dataset, demonstrate that the proposed ranking model significantly outperforms state-of-the-art ranking and reranking methods.
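To make the idea of a visual regularization term concrete, the following is a minimal sketch, not the formulation used in the paper. It assumes a linear scoring function over textual features, a pairwise hinge loss standing in for the structured output loss, and a Gaussian-similarity smoothness penalty that discourages visually similar items from receiving very different scores. All names (`visual_weights`, `smoothness`, `objective`) and the parameters `lam`, `gamma`, and `sigma` are hypothetical and chosen only for illustration.

```python
import math


def visual_weights(visual, sigma=1.0):
    """Gaussian similarity between every pair of visual feature vectors."""
    n = len(visual)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(visual[i], visual[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W


def smoothness(scores, W):
    """Penalty that grows when visually similar items get different scores."""
    n = len(scores)
    return 0.5 * sum(
        W[i][j] * (scores[i] - scores[j]) ** 2
        for i in range(n)
        for j in range(n)
    )


def objective(w, text, visual, pairs, lam=0.1, gamma=0.1):
    """Margin term + pairwise hinge loss + visual regularizer.

    text:   one textual feature vector per item (assumed representation)
    visual: one visual feature vector per item (assumed representation)
    pairs:  (i, j) meaning item i should rank above item j
    """
    scores = [sum(wk * xk for wk, xk in zip(w, x)) for x in text]
    hinge = sum(max(0.0, 1.0 - (scores[i] - scores[j])) for i, j in pairs)
    margin = 0.5 * lam * sum(wk * wk for wk in w)
    return margin + hinge + gamma * smoothness(scores, visual_weights(visual))
```

Setting `gamma=0` reduces this sketch to a text-only pairwise ranker, so the last term isolates the contribution of visual content. Because the penalty sits inside the training objective, visual information influences the learned weights directly rather than being applied afterwards to a fixed text-based result list.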
