Policy Optimization Reinforcement Learning with Entropy Regularization
[1] Wojciech M. Czarnecki, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning, 2019, Nature.
[2] Shane Legg, et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures, 2018, ICML.
[3] Xiaoxiao Guo. Deep Learning and Reward Design for Reinforcement Learning, 2017.
[4] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[5] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[6] Demis Hassabis, et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, 2017, ArXiv.
[7] Rémi Munos, et al. Recurrent Experience Replay in Distributed Reinforcement Learning, 2018, ICLR.
[8] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[9] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[10] Henry Zhu, et al. Soft Actor-Critic Algorithms and Applications, 2018, ArXiv.
[11] Ruben Villegas, et al. Learning Latent Dynamics for Planning from Pixels, 2018, ICML.
[12] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[13] Jakub W. Pachocki, et al. Dota 2 with Large Scale Deep Reinforcement Learning, 2019, ArXiv.
[14] David Budden, et al. Distributed Prioritized Experience Replay, 2018, ICLR.