Navdeep Jaitly | James Davidson | Luke Metz | Julian Ibarz
[1] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[2] R. J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[3] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.
[4] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[5] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[6] David K. Smith, et al. Dynamic Programming and Optimal Control, Volume 1, 1996.
[7] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[8] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[9] Samy Bengio, et al. Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks, 1999, NIPS.
[10] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[11] Yoshua Bengio, et al. A Neural Probabilistic Language Model, 2003, J. Mach. Learn. Res.
[12] Geoffrey E. Hinton, et al. Reinforcement Learning with Factored States and Actions, 2004, J. Mach. Learn. Res.
[13] Benjamin Van Roy, et al. On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming, 2004, Math. Oper. Res.
[14] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[15] Yoshua Bengio, et al. Hierarchical Probabilistic Neural Network Language Model, 2005, AISTATS.
[16] Michail G. Lagoudakis, et al. Binary Action Search for Learning Continuous-Action Control Policies, 2009, ICML '09.
[17] Hugo Larochelle, et al. The Neural Autoregressive Distribution Estimator, 2011, AISTATS.
[18] Jason Pazis, et al. Generalized Value Functions for Large Action Sets, 2011, ICML.
[19] Jason Pazis, et al. Reinforcement Learning in Multidimensional Continuous Action Spaces, 2011, IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[20] Martha White, et al. Linear Off-Policy Actor-Critic, 2012, ICML.
[21] Yuval Tassa, et al. MuJoCo: A Physics Engine for Model-Based Control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[22] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[23] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[24] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[25] Richard Evans, et al. Deep Reinforcement Learning in Large Discrete Action Spaces, 2015, ArXiv 1512.07679.
[26] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[27] Yann LeCun, et al. The Loss Surfaces of Multilayer Networks, 2014, AISTATS.
[28] Julien Cornebise, et al. Weight Uncertainty in Neural Networks, 2015, ArXiv.
[29] Shane Legg, et al. Human-Level Control through Deep Reinforcement Learning, 2015, Nature.
[30] Yuval Tassa, et al. Continuous Control with Deep Reinforcement Learning, 2015, ICLR.
[31] Dilin Wang, et al. Learning to Draw Samples: With Application to Amortized MLE for Generative Adversarial Learning, 2016, ArXiv.
[32] Dale Schuurmans, et al. Reward Augmented Maximum Likelihood for Neural Structured Prediction, 2016, NIPS.
[33] Andriy Mnih, et al. Variational Inference for Monte Carlo Objectives, 2016, ICML.
[34] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[35] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[36] Koray Kavukcuoglu, et al. Pixel Recurrent Neural Networks, 2016, ICML.
[37] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[38] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[39] Dilin Wang, et al. Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm, 2016, NIPS.
[40] Demis Hassabis, et al. Mastering the Game of Go with Deep Neural Networks and Tree Search, 2016, Nature.
[41] Yang Liu, et al. Minimum Risk Training for Neural Machine Translation, 2015, ACL.
[42] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[43] Sergey Levine, et al. Continuous Deep Q-Learning with Model-Based Acceleration, 2016, ICML.
[44] Marc G. Bellemare, et al. Safe and Efficient Off-Policy Reinforcement Learning, 2016, NIPS.
[45] Pieter Abbeel, et al. Stochastic Neural Networks for Hierarchical Reinforcement Learning, 2016, ICLR.
[46] Xi Chen, et al. PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications, 2017, ICLR.
[47] Quoc V. Le, et al. Neural Architecture Search with Reinforcement Learning, 2016, ICLR.
[48] Lei Xu, et al. Input Convex Neural Networks: Supplementary Material, 2017.
[49] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[50] Dale Schuurmans, et al. Bridging the Gap Between Value and Policy Based Reinforcement Learning, 2017, NIPS.
[51] Sham M. Kakade, et al. Towards Generalization and Simplicity in Continuous Control, 2017, NIPS.
[52] Pieter Abbeel, et al. Equivalence Between Policy Gradients and Soft Q-Learning, 2017, ArXiv.
[53] Samy Bengio, et al. Neural Combinatorial Optimization with Reinforcement Learning, 2016, ICLR.
[54] Peter Henderson, et al. Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control, 2017, ArXiv.
[55] Philip Bachman, et al. Deep Reinforcement Learning that Matters, 2017, AAAI.