Generative Adversarial User Model for Reinforcement Learning Based Recommendation System
Yuan Qi | Le Song | Hui Li | Xinshi Chen | Shuang Li | Shaohua Jiang
[1] Jianfeng Gao, et al. Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads, 2016, EMNLP.
[2] Sergey Levine, et al. Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[3] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[4] Nicholas Jing Yuan, et al. DRN: A Deep Reinforcement Learning Framework for News Recommendation, 2018, WWW.
[5] Maja J. Mataric, et al. Reward Functions for Accelerated Learning, 1994, ICML.
[6] Carl E. Rasmussen, et al. Gaussian Processes for Data-Efficient Learning in Robotics and Control, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[7] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[8] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[9] Dietmar Jannach, et al. When Recurrent Neural Networks meet the Neighborhood for Session-Based Recommendation, 2017, RecSys.
[10] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[11] D. McFadden. Conditional logit analysis of qualitative choice behavior, 1972.
[12] Tianqi Chen, et al. XGBoost: A Scalable Tree Boosting System, 2016, KDD.
[13] Alexandros Karatzoglou, et al. Session-based Recommendations with Recurrent Neural Networks, 2015, ICLR.
[14] Peter Stone, et al. Behavioral Cloning from Observation, 2018, IJCAI.
[15] Liang Zhang, et al. Deep Reinforcement Learning for List-wise Recommendations, 2017, ArXiv.
[16] Sergey Levine, et al. Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning, 2018, ICLR.
[17] Nadine Le Fort-Piat, et al. Reward Function and Initial Values: Better Choices for Accelerated Goal-Directed Reinforcement Learning, 2006, ICANN.
[18] Yang Yu, et al. Virtual-Taobao: Virtualizing Real-world Online Retail Environment for Reinforcement Learning, 2018, AAAI.
[19] Liang Zhang, et al. Deep reinforcement learning for page-wise recommendations, 2018, RecSys.
[20] Heng-Tze Cheng, et al. Wide & Deep Learning for Recommender Systems, 2016, DLRS@RecSys.
[21] Chris Watkins, et al. Learning from delayed rewards, 1989.
[22] Yunming Ye, et al. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction, 2017, IJCAI.
[23] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[24] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[25] Wei Chu, et al. A contextual-bandit approach to personalized news article recommendation, 2010, WWW '10.
[26] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[27] Sergey Levine, et al. Learning to Adapt: Meta-Learning for Model-Based Control, 2018, ArXiv.
[28] Shuang-Hong Yang, et al. Collaborative competitive filtering: learning recommender using context of user choice, 2011, SIGIR.
[29] Stefano Ermon, et al. Model-Free Imitation Learning with Policy Optimization, 2016, ICML.
[30] C. Manski. Maximum Score Estimation of the Stochastic Utility Model of Choice, 1975.