MOFSRank: A Multiobjective Evolutionary Algorithm for Feature Selection in Learning to Rank

Learning to rank has attracted increasing interest over the past decade because of its wide application in areas such as document retrieval and collaborative filtering. Feature selection for learning to rank aims to select, from the original large feature set, a small number of features that still ensure high ranking accuracy, since in many real ranking applications many features are redundant or even irrelevant. To this end, this paper proposes a multiobjective evolutionary algorithm for feature selection in learning to rank, termed MOFSRank, which consists of three components. First, an instance selection strategy chooses the informative instances from the ranking training set, which removes redundant data and improves training efficiency. Then, on the selected instance subsets, a multiobjective feature selection algorithm with adaptive mutation is developed, which obtains good feature subsets by selecting features with high ranking accuracy and low redundancy. Finally, an ensemble strategy uses the obtained feature subsets to produce a better set of features. Experimental results on benchmark data sets confirm the advantage of the proposed method over state-of-the-art methods.
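
The two-objective selection and the ensemble step can be pictured with a small sketch. This is not the MOFSRank implementation: it only illustrates generic Pareto (non-dominated) filtering of candidate feature subsets, followed by a simple frequency vote across the surviving subsets. The function names (`dominates`, `non_dominated`, `vote_ensemble`), the voting rule, and the toy objective values are all assumptions made for illustration; the abstract does not specify how the ensemble is built or how the objectives are computed.

```python
from collections import Counter
from typing import List, Sequence, Set, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` Pareto-dominates `b` when every objective is minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(objs: List[Tuple[float, float]]) -> List[int]:
    """Return indices of candidates that no other candidate dominates."""
    return [
        i
        for i, a in enumerate(objs)
        if not any(dominates(b, a) for j, b in enumerate(objs) if j != i)
    ]


def vote_ensemble(subsets: List[Set[int]], min_votes: int) -> Set[int]:
    """Hypothetical ensemble rule: keep features present in at least `min_votes` subsets."""
    counts = Counter(f for s in subsets for f in s)
    return {f for f, c in counts.items() if c >= min_votes}


# Toy candidates (made-up numbers): each feature subset is scored by
# (ranking error, redundancy), both to be minimized.
subsets = [{0, 1, 2}, {0, 3}, {1, 2, 3, 4}, {0, 2}]
objs = [(0.30, 0.20), (0.35, 0.10), (0.32, 0.40), (0.40, 0.15)]

front = non_dominated(objs)
final_features = vote_ensemble([subsets[i] for i in front], min_votes=2)
```

Here the third and fourth candidates are dominated by the first and second respectively, so only the first two subsets remain on the front, and the vote keeps the single feature they share. An evolutionary algorithm would repeat this kind of filtering over many generations of mutated subsets rather than on a fixed list.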
