A deep reinforcement learning framework for allocating buyer impressions in e-commerce websites
Yiwei Zhang | Pingzhong Tang | Aris Filos-Ratsikas | Qingpeng Cai
[1] A. Rubinstein. Modeling Bounded Rationality, 1998.
[2] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[3] Constantinos Daskalakis, et al. Learning in Auctions: Regret is Hard, Envy is Easy, 2015, 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS).
[4] Pingzhong Tang, et al. Mechanism Design for Personalized Recommender Systems, 2016, RecSys.
[5] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[6] Robert Babuska, et al. Experience Replay for Real-Time Reinforcement Learning Control, 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[7] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[8] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[9] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[10] Lior Rokach, et al. Introduction to Recommender Systems Handbook, 2011, Recommender Systems Handbook.
[11] David Silver, et al. Memory-based control with recurrent neural networks, 2015, ArXiv.
[12] Alan A. Stocker, et al. Human Decision-Making under Limited Time, 2016, NIPS.
[13] Eric Maskin, et al. Mechanism Design: How to Implement Social Goals, 2008.
[14] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[15] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 2004, Machine Learning.
[16] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[17] Patrick M. Pilarski, et al. Model-Free reinforcement learning with continuous action in practice, 2012, 2012 American Control Conference (ACC).
[18] P. Cochat, et al. Et al, 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.
[19] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[20] Éva Tardos, et al. Learning and Efficiency in Games with Dynamic Population, 2015, SODA.
[21] Éva Tardos, et al. No-Regret Learning in Bayesian Games, 2015, NIPS.
[22] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[23] Shalabh Bhatnagar, et al. Incremental Natural Actor-Critic Algorithms, 2007, NIPS.
[24] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[25] Éva Tardos, et al. Econometrics for Learning Agents, 2015, EC.