Improving Generalization of Reinforcement Learning with Minimax Distributional Soft Actor-Critic
[1] Jun Morimoto, et al. Robust Reinforcement Learning, 2005, Neural Computation.
[2] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[3] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[4] Silvio Savarese, et al. Adversarially Robust Policy Learning: Active construction of physically-plausible perturbations, 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[5] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[6] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[7] Yang Gao, et al. Risk Averse Robust Adversarial Reinforcement Learning, 2019, 2019 International Conference on Robotics and Automation (ICRA).
[8] Qi Sun, et al. Hierarchical Reinforcement Learning for Self-Driving Decision-Making without Reliance on Labeled Driving Data, 2020, IET Intelligent Transport Systems.
[9] Olivier Sigaud, et al. Investigating Generalisation in Continuous Deep Reinforcement Learning, 2019, ArXiv.
[10] Marcin Andrychowicz, et al. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[11] Michael C. Fu, et al. Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint, 2018, ArXiv.
[12] Shengbo Eben Li, et al. Addressing Value Estimation Errors in Reinforcement Learning with a State-Action Return Distribution Function, 2020, ArXiv.
[13] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[14] Balaraman Ravindran, et al. EPOpt: Learning Robust Neural Network Policies Using Model Ensembles, 2016, ICLR.
[15] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[16] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[17] Qi Sun, et al. Centralized Conflict-free Cooperation for Connected and Automated Vehicles at Intersections by Proximal Policy Optimization, 2019, ArXiv.
[18] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[19] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[20] Dawn Xiaodong Song, et al. Assessing Generalization in Deep Reinforcement Learning, 2018, ArXiv.
[21] Girish Chowdhary, et al. Robust Deep Reinforcement Learning with Adversarial Attacks, 2017, AAMAS.
[22] Abhinav Gupta, et al. Robust Adversarial Reinforcement Learning, 2017, ICML.
[23] Sergey Levine, et al. Composable Deep Reinforcement Learning for Robotic Manipulation, 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[24] Shie Mannor, et al. Optimizing the CVaR via Sampling, 2014, AAAI.
[25] Jonathon Shlens, et al. Explaining and Harnessing Adversarial Examples, 2014, ICLR.
[26] Marc G. Bellemare, et al. A Comparative Analysis of Expected and Distributional Reinforcement Learning, 2019, AAAI.
[27] Javier García, et al. A comprehensive survey on safe reinforcement learning, 2015, J. Mach. Learn. Res.