[1] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[2] Matthieu Geist, et al. What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study, 2021, ICLR.
[3] Balaraman Ravindran, et al. Dynamic Action Repetition for Deep Reinforcement Learning, 2017, AAAI.
[4] L. C. Baird, et al. Reinforcement learning in continuous time: advantage updating, 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).
[5] Peter Stone, et al. Deep Recurrent Q-Learning for Partially Observable MDPs, 2015, AAAI Fall Symposia.
[6] Yann Ollivier, et al. Making Deep Q-learning methods robust to time discretization, 2019, ICML.
[7] Rémi Munos, et al. Reinforcement Learning for Continuous Stochastic Control Problems, 1997, NIPS.
[8] Elliot Meyerson, et al. Frame Skip Is a Powerful Parameter for Learning to Play Atari, 2015, AAAI Workshop: Learning for General Competency in Video Games.
[9] Michael O. Duff, et al. Reinforcement Learning Methods for Continuous-Time Markov Decision Problems, 1994, NIPS.
[10] Sergey Levine, et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation, 2018, CoRL.
[11] Tuo Zhao, et al. Deep Reinforcement Learning with Robust and Smooth Policy, 2020, ICML.
[12] James Bergstra, et al. Autoregressive Policies for Continuous Control Deep Reinforcement Learning, 2019, IJCAI.
[13] Sergey Levine, et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[14] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[15] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[16] Rahul Singh, et al. Improving Robustness via Risk Averse Distributional Reinforcement Learning, 2020, L4DC.
[17] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[18] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[19] Marcello Restelli, et al. Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning, 2020, ICML.
[20] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[21] Vitaly Levdik, et al. Time Limits in Reinforcement Learning, 2017, ICML.
[22] Kenji Doya, et al. Reinforcement Learning in Continuous Time and Space, 2000, Neural Computation.
[23] Mohammad Norouzi, et al. Dream to Control: Learning Behaviors by Latent Imagination, 2019, ICLR.
[24] Wulfram Gerstner, et al. Reinforcement Learning Using a Continuous Time Actor-Critic Framework with Spiking Neurons, 2013, PLoS Comput. Biol.
[25] Rémi Munos, et al. Policy Gradient in Continuous Time, 2006, J. Mach. Learn. Res.
[26] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[27] Mohammad Norouzi, et al. Mastering Atari with Discrete World Models, 2020, ICLR.
[28] Finale Doshi-Velez, et al. Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs, 2020, NeurIPS.
[29] Balaraman Ravindran, et al. Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning, 2017, ICLR.
[30] R. Rockafellar, et al. Conditional Value-at-Risk for General Loss Distributions, 2001.
[31] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[32] Demis Hassabis, et al. Mastering Atari, Go, chess and shogi by planning with a learned model, 2019, Nature.