Semi-Supervised Ensemble Ranking

Ranking plays a central role in many Web search and information retrieval applications. Ensemble ranking, sometimes called meta-search, aims to improve retrieval performance by combining the outputs of multiple ranking algorithms. Many ensemble ranking approaches employ supervised learning techniques to learn appropriate weights for combining multiple rankers. The main shortcoming of these approaches is that the learned weights are query-independent. This is suboptimal, since a ranking algorithm may perform well for certain queries but poorly for others. In this paper, we propose a novel semi-supervised ensemble ranking (SSER) algorithm that learns query-dependent weights when combining multiple rankers in document retrieval. The proposed SSER algorithm is formulated as an SVM-like quadratic program (QP), and can therefore be solved efficiently using optimization techniques that are widely used in existing SVM solvers. We evaluated the proposed technique on a standard document retrieval testbed and observed encouraging results in comparison with a number of state-of-the-art techniques.
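To make the distinction between query-independent and query-dependent weights concrete, the following is a minimal Python sketch of a weighted linear combination of ranker scores. It is an illustration only and does not implement the SSER quadratic program: the function names (`combine_scores`, `rank`), the scores, and both weight vectors are hypothetical values chosen for the example, not learned quantities or results from the paper.

```python
from typing import List, Sequence


def combine_scores(ranker_scores: Sequence[Sequence[float]],
                   weights: Sequence[float]) -> List[float]:
    """Return one combined score per document as a weighted sum over rankers.

    ranker_scores[k][d] is the score ranker k assigns to document d.
    """
    n_docs = len(ranker_scores[0])
    return [
        sum(w * scores[d] for w, scores in zip(weights, ranker_scores))
        for d in range(n_docs)
    ]


def rank(scores: Sequence[float]) -> List[int]:
    """Return document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)


# Hypothetical scores from two rankers for three documents of a single query.
ranker_scores = [
    [0.9, 0.2, 0.4],  # ranker A
    [0.1, 0.8, 0.3],  # ranker B
]

# Query-independent setting: one weight vector shared by every query (assumed values).
global_weights = [0.7, 0.3]

# Query-dependent setting: weights specific to this query (assumed values),
# reflecting a case where ranker B happens to be more reliable for it.
query_weights = [0.2, 0.8]

global_ranking = rank(combine_scores(ranker_scores, global_weights))
query_ranking = rank(combine_scores(ranker_scores, query_weights))

print("query-independent ranking:", global_ranking)
print("query-dependent ranking:  ", query_ranking)
```

With these made-up numbers the two weightings order the documents differently, which is the situation the abstract describes: a single global weight vector cannot favor ranker B on the queries where it is the better ranker.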
