Expected Divergence Based Feature Selection for Learning to Rank

Feature selection methods are essential for learning to rank (LTR), since the number of features is directly proportional to computational cost and a large feature set may lead to over-fitting of the ranking model. We propose an expected-divergence-based approach for selecting a subset of features that are highly discriminative across relevance categories. The proposed method is evaluated in terms of the performance of standard LTR algorithms trained on the reduced feature sets over a collection of standard LTR datasets. With fewer than 10% of the original features, the method performs no worse than the baselines, and in some cases significantly better. The method is also scalable and easily parallelised.
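
The abstract does not spell out the scoring function, so the following is only a minimal sketch of one plausible reading of "expected divergence": each feature is scored by the average symmetrised KL divergence between histogram estimates of its value distribution under each relevance category, and the top-k features are kept. The helper names (`feature_divergence`, `select_features`) and the histogram density estimator are assumptions for illustration, not the authors' implementation.

```python
# Sketch only: divergence-based feature scoring for LTR feature selection.
# Assumes a dense feature matrix X (n_docs x n_features) and integer
# relevance labels y; the histogram estimator and symmetrised KL score
# are illustrative choices, not the paper's exact method.
import numpy as np

def feature_divergence(values, labels, bins=32, eps=1e-12):
    """Score one feature: average pairwise symmetrised KL divergence
    between histogram densities of its values per relevance category."""
    edges = np.histogram_bin_edges(values, bins=bins)
    densities = []
    for c in np.unique(labels):
        hist, _ = np.histogram(values[labels == c], bins=edges)
        p = hist.astype(float) + eps  # smooth empty bins before normalising
        densities.append(p / p.sum())
    score, pairs = 0.0, 0
    for i in range(len(densities)):
        for j in range(i + 1, len(densities)):
            p, q = densities[i], densities[j]
            # symmetrised KL: KL(p||q) + KL(q||p)
            score += np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p))
            pairs += 1
    return score / max(pairs, 1)

def select_features(X, y, k):
    """Return indices of the k features with the highest divergence score."""
    scores = np.array([feature_divergence(X[:, f], y) for f in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k]
```

Since each feature is scored independently of the others, the scoring loop parallelises trivially across features, which is consistent with the scalability claim above.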
