Towards TempoRL: Learning When to Act
[1] L. C. Baird, et al. Reinforcement learning in continuous time: advantage updating, 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).
[2] Doina Precup, et al. Theoretical Results on Reinforcement Learning with Temporally Abstract Options, 1998, ECML.
[3] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[4] Kenji Doya, et al. Reinforcement Learning in Continuous Time and Space, 2000, Neural Computation.
[5] Doina Precup, et al. Learning Options in Reinforcement Learning, 2002, SARA.
[6] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[7] Elliot Meyerson, et al. Frame Skip Is a Powerful Parameter for Learning to Play Atari, 2015, AAAI Workshop: Learning for General Competency in Video Games.
[8] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[9] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[10] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[11] Marc W. Howard, et al. Scale Invariant Value Computation for Reinforcement Learning in Continuous Time, 2017, AAAI Spring Symposia.
[12] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[13] Balaraman Ravindran, et al. Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning, 2017, ICLR.
[14] Balaraman Ravindran, et al. Dynamic Action Repetition for Deep Reinforcement Learning, 2017, AAAI.
[15] Shie Mannor, et al. Learning Robust Options, 2018, AAAI.
[16] Doina Precup, et al. When Waiting is not an Option: Learning Options with a Deliberation Cost, 2017, AAAI.
[17] Doina Precup, et al. Learning with Options that Terminate Off-Policy, 2017, AAAI.
[18] Feng Jiang, et al. Optimal Skipping Rates: Training Agents with Fine-Grained Control Using Deep Reinforcement Learning, 2019, J. Robotics.
[19] Doina Precup, et al. Learning Options with Interest Functions, 2019, AAAI.
[20] Quanyan Zhu, et al. Continuous-Time Markov Decision Processes with Controlled Observations, 2019, 2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[21] Igor Mordatch, et al. Emergent Tool Use From Multi-Agent Autocurricula, 2019, ICLR.