Ranking for Relevance and Display Preferences in Complex Presentation Layouts

Learning to Rank has traditionally considered settings where, given the relevance information of objects, the desired order in which to rank the objects is clear. However, with today's large variety of users and layouts, this is not always the case. In this paper, we consider so-called complex ranking settings, where it is clear neither what should be displayed (which items are relevant) nor how it should be displayed (where the most relevant items should be placed). These settings are complex because they involve both traditional ranking and inferring the best display order. Existing learning to rank methods cannot handle such settings, as they assume that the display order is known beforehand. To address this gap, we introduce a novel Deep Reinforcement Learning method that is capable of learning complex rankings, that is, both the layout and the best ranking given the layout, from weak reward signals. Our proposed method does so by selecting documents and positions sequentially; because it ranks both the documents and the positions, we call it the Double Rank Model (DRM). Our experiments show that DRM outperforms all existing methods in complex ranking settings, and thus leads to substantial ranking improvements in cases where the display order is not known a priori.
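
To make the idea of selecting documents and positions sequentially concrete, the following is a minimal sketch, not the DRM implementation itself. It assumes a scoring function is already available and greedily fills a layout one (document, position) pair at a time. The names `double_rank`, `toy_score`, `relevance`, and `attention` are hypothetical and chosen for illustration only; in the setting described above, the score would be learned from weak reward signals rather than fixed by hand.

```python
from typing import Callable, Dict, List


def double_rank(
    docs: List[str],
    positions: List[int],
    score: Callable[[str, int, Dict[int, str]], float],
) -> Dict[int, str]:
    """Greedily fill a layout by picking one (document, position) pair per step."""
    layout: Dict[int, str] = {}
    free_docs = list(docs)
    free_positions = list(positions)
    while free_docs and free_positions:
        # Consider every remaining pair and take the highest-scoring one.
        doc, pos = max(
            ((d, p) for d in free_docs for p in free_positions),
            key=lambda pair: score(pair[0], pair[1], layout),
        )
        layout[pos] = doc
        free_docs.remove(doc)
        free_positions.remove(pos)
    return layout


# Hypothetical toy data: a stand-in for a learned value function.
relevance = {"a": 0.9, "b": 0.5, "c": 0.1}
attention = {0: 0.2, 1: 1.0, 2: 0.6}  # assumption: the middle slot is examined most


def toy_score(doc: str, pos: int, layout: Dict[int, str]) -> float:
    # Assumed form: relevance weighted by how much attention the slot receives.
    return relevance[doc] * attention[pos]


layout = double_rank(list(relevance), list(attention), toy_score)
print(layout)
```

With these toy values, the most relevant document is placed in the middle slot rather than the first one, which is the kind of outcome a fixed top-to-bottom display order cannot express.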
