Incorporating query constraints for autoencoder enhanced ranking

Abstract: Learning to rank has been widely used in information retrieval to construct ranking models for document retrieval. Existing learning to rank methods use supervised machine learning as the core technique and the scores of classical retrieval models as document features. Because the quality of these features strongly affects the effectiveness of the resulting ranking model, it is worthwhile to generate additional effective document features that extend the feature space of learning to rank and better model the relevance between queries and their corresponding documents. Recently, deep neural network models have been used to generate effective features for various text mining tasks. Autoencoders, one type of neural network building block, capture semantic information as features through an encoder-decoder framework. In this paper, we incorporate autoencoders into the construction of ranking models based on learning to rank. In our method, autoencoders generate document features that capture the semantic information of documents. We propose a query-level semi-supervised autoencoder that considers three types of query constraints based on Bregman divergence. We evaluate our model on datasets from LETOR 3.0 and LETOR 4.0 and show that it significantly outperforms competing methods in retrieval performance.
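
The abstract names Bregman divergence as the basis of the query constraints but does not state the formula used. As background only, the sketch below implements the standard definition, D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>, for a convex function phi, together with two well-known special cases. The function names are illustrative assumptions, and this is not the paper's constraint formulation.

```python
import math


def bregman_divergence(phi, grad_phi, x, y):
    """Standard Bregman divergence between vectors x and y.

    phi is a convex function on vectors and grad_phi returns its gradient.
    All names here are illustrative, not taken from the paper.
    """
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner


# phi(v) = ||v||^2 yields the squared Euclidean distance.
def sq_norm(v):
    return sum(vi * vi for vi in v)


def sq_norm_grad(v):
    return [2.0 * vi for vi in v]


# phi(v) = sum_i v_i log v_i (negative entropy) yields the generalized
# KL divergence; entries must be strictly positive.
def neg_entropy(v):
    return sum(vi * math.log(vi) for vi in v)


def neg_entropy_grad(v):
    return [math.log(vi) + 1.0 for vi in v]


if __name__ == "__main__":
    x, y = [1.0, 2.0], [3.0, 5.0]
    print(bregman_divergence(sq_norm, sq_norm_grad, x, y))

    p, q = [0.5, 0.5], [0.25, 0.75]
    print(bregman_divergence(neg_entropy, neg_entropy_grad, p, q))
```

Passing a different convex phi changes the geometry the divergence measures, which is why a single definition can cover several distance-like penalties.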
