[1] Deborah Estrin,et al. Unbiased offline recommender evaluation for missing-not-at-random implicit feedback , 2018, RecSys.
[2] Hongning Wang,et al. Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation , 2019, NeurIPS.
[3] Guy Shani,et al. An MDP-Based Recommender System , 2002, J. Mach. Learn. Res.
[4] Yoshua Bengio,et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.
[5] Rémi Munos,et al. Implicit Quantile Networks for Distributional Reinforcement Learning , 2018, ICML.
[6] John Langford,et al. Off-policy evaluation for slate recommendation , 2016, NIPS.
[7] Xing Xie,et al. Session-based Recommendation with Graph Neural Networks , 2018, AAAI.
[8] Diksha Garg,et al. Sequence and Time Aware Neighborhood for Session-based Recommendations: STAN , 2019, SIGIR.
[9] Marc G. Bellemare,et al. Distributional Reinforcement Learning with Quantile Regression , 2017, AAAI.
[10] Liang Zhang,et al. Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning , 2018, KDD.
[11] Ulf Brefeld,et al. Factored MDPs for detecting topics of user sessions , 2014, RecSys '14.
[12] Sergey Levine,et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems , 2020, ArXiv.
[13] Liang Zhang,et al. Deep Reinforcement Learning for List-wise Recommendations , 2017, ArXiv.
[14] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[15] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[16] Masashi Sugiyama,et al. Nonparametric Return Distribution Approximation for Reinforcement Learning , 2010, ICML.
[17] Craig Boutilier,et al. Reinforcement Learning for Slate-based Recommender Systems: A Tractable Decomposition and Practical Methodology , 2019, ArXiv.
[18] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[19] Omer Levy,et al. word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method , 2014, ArXiv.
[20] Dietmar Jannach,et al. Evaluation of session-based recommendation algorithms , 2018, User Modeling and User-Adapted Interaction.
[21] Nando de Freitas,et al. Critic Regularized Regression , 2020, NeurIPS.
[22] Craig Boutilier,et al. RecSim: A Configurable Simulation Platform for Recommender Systems , 2019, ArXiv.
[23] Alexandros Karatzoglou,et al. Recurrent Neural Networks with Top-k Gains for Session-based Recommendations , 2017, CIKM.
[24] P. J. Huber. Robust Estimation of a Location Parameter , 1964.
[25] Diksha Garg,et al. NISER: Normalized Item and Session Representations to Handle Popularity Bias , 2019.
[26] Tao Li,et al. Causality and Batch Reinforcement Learning: Complementary Approaches To Planning In Unknown Domains , 2020, ArXiv.
[27] Gang Chen,et al. Off-Policy Recommendation System Without Exploration , 2020, PAKDD.
[28] Sergey Levine,et al. Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction , 2019, NeurIPS.
[29] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[30] Natasha Jaques,et al. Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog , 2019, ArXiv.
[31] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[32] Mohammad Norouzi,et al. An Optimistic Perspective on Offline Deep Reinforcement Learning , 2020, International Conference on Machine Learning.
[33] Yuan Qi,et al. Generative Adversarial User Model for Reinforcement Learning Based Recommendation System , 2018, ICML.
[34] Wei Chu,et al. Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms , 2010, WSDM '11.
[35] Liang Zhang,et al. Deep reinforcement learning for page-wise recommendations , 2018, RecSys.
[36] Diksha Garg,et al. NISER: Normalized Item and Session Representations with Graph Neural Networks , 2019, ArXiv.
[37] Qiao Liu,et al. STAMP: Short-Term Attention/Memory Priority Model for Session-based Recommendation , 2018, KDD.
[38] Joelle Pineau,et al. Benchmarking Batch Deep Reinforcement Learning Algorithms , 2019, ArXiv.
[39] Nicholas Jing Yuan,et al. DRN: A Deep Reinforcement Learning Framework for News Recommendation , 2018, WWW.
[40] Mohammad Norouzi,et al. An Optimistic Perspective on Offline Reinforcement Learning , 2020, ICML.
[41] Martha White,et al. An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning , 2015, J. Mach. Learn. Res.
[42] Frederick R. Forst,et al. On robust estimation of the location parameter , 1980.
[43] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[44] Nando de Freitas,et al. Hyperparameter Selection for Offline Reinforcement Learning , 2020, ArXiv.
[45] George Tucker,et al. Conservative Q-Learning for Offline Reinforcement Learning , 2020, NeurIPS.
[46] Doina Precup,et al. Off-Policy Deep Reinforcement Learning without Exploration , 2018, ICML.
[47] Thorsten Joachims,et al. MOReL : Model-Based Offline Reinforcement Learning , 2020, NeurIPS.