UvA-DARE (Digital Academic Repository) Stochastic Activation Actor Critic Methods