A deep actor critic reinforcement learning framework for learning to rank

[1]  D. Wang,et al.  Reward Shaping-Based Actor–Critic Deep Reinforcement Learning for Residential Energy Management , 2023, IEEE Transactions on Industrial Informatics.

[2]  Jun Wang,et al.  MarlRank: Multi-agent Reinforced Learning to Rank , 2019, CIKM.

[3]  Huazheng Wang,et al.  Variance Reduction in Gradient Exploration for Online Learning to Rank , 2019, SIGIR.

[4]  A. Shakery,et al.  ERR.Rank: An algorithm based on learning to rank for direct optimization of Expected Reciprocal Rank , 2018, Applied Intelligence.

[5]  Wei Zeng,et al.  Multi Page Search with Reinforcement Learning to Rank , 2018, ICTIR.

[6]  Wei Zeng,et al.  From Greedy Selection to Exploratory Decision-Making: Diverse Ranking with Policy-Value Networks , 2018, SIGIR.

[7]  Xiaoyan Zhu,et al.  Learning to Collaborate: Multi-Scenario Ranking via Multi-Agent Reinforcement Learning , 2018, WWW.

[8]  Ke Wang,et al.  Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding , 2018, WSDM.

[9]  Marc Peter Deisenroth,et al.  Deep Reinforcement Learning: A Brief Survey , 2017, IEEE Signal Processing Magazine.

[10]  Jiafeng Guo,et al.  Reinforcement Learning to Rank with Markov Decision Process , 2017, SIGIR.

[11]  Wei Zeng,et al.  Adapting Markov Decision Process for Search Result Diversification , 2017, SIGIR.

[12]  W. Bruce Croft,et al.  A Deep Relevance Matching Model for Ad-hoc Retrieval , 2016, CIKM.

[13]  Thorsten Joachims,et al.  Unbiased Learning-to-Rank with Biased Feedback , 2016, WSDM.

[14]  Grace Hui Yang,et al.  Win-win search: dual-agent stochastic game in session search , 2014, SIGIR.

[15]  Éric Gaussier,et al.  A Theoretical Analysis of Pseudo-Relevance Feedback Models , 2013, ICTIR.

[16]  Yiming Yang,et al.  Multilabel classification with meta-level features in a learning-to-rank framework , 2012, Machine Learning.

[17]  Tao Qin,et al.  LETOR: A benchmark collection for research on learning to rank for information retrieval , 2010, Information Retrieval.

[18]  Thorsten Joachims,et al.  Interactively optimizing information retrieval systems as a dueling bandits problem , 2009, ICML '09.

[19]  Tie-Yan Liu,et al.  Learning to rank for information retrieval , 2009, SIGIR.

[20]  Chiranjib Bhattacharyya,et al.  Structured learning for non-smooth ranking losses , 2008, KDD.

[21]  Filip Radlinski,et al.  Learning diverse rankings with multi-armed bandits , 2008, ICML '08.

[22]  Stephen E. Robertson,et al.  SoftRank: optimizing non-smooth rank metrics , 2008, WSDM '08.

[23]  Filip Radlinski,et al.  A support vector method for optimizing average precision , 2007, SIGIR.

[24]  Hang Li,et al.  AdaRank: a boosting algorithm for information retrieval , 2007, SIGIR.

[25]  Tie-Yan Liu,et al.  Learning to rank: from pairwise approach to listwise approach , 2007, ICML '07.

[26]  Thorsten Joachims,et al.  In Google We Trust: Users' Decisions on Rank, Position, and Relevance , 2007, J. Comput. Mediat. Commun.

[27]  Gregory N. Hullender,et al.  Learning to rank using gradient descent , 2005, ICML.

[28]  Jaana Kekäläinen,et al.  Cumulated gain-based evaluation of IR techniques , 2002, TOIS.

[29]  Thorsten Joachims,et al.  Optimizing search engines using clickthrough data , 2002, KDD.

[30]  Koby Crammer,et al.  Pranking with Ranking , 2001, NIPS.

[31]  Yishay Mansour,et al.  Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.

[32]  Ronald J. Williams,et al.  Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.

[33]  D. W. Zimmerman.  Comparative Power of Student T Test and Mann-Whitney U Test for Unequal Sample Sizes and Variances , 1987.

[34]  Filip Radlinski,et al.  Ranked bandits in metric spaces: learning diverse rankings over large document collections , 2013, J. Mach. Learn. Res.

[35]  Jaana Kekäläinen,et al.  IR evaluation methods for retrieving highly relevant documents , 2000, SIGIR '00.

[36]  Katja Hofmann,et al.  Balancing Exploration and Exploitation in Listwise and Pairwise Online Learning to Rank for Information Retrieval , 2013, Information Retrieval.