Erin J. Talvitie | Martha White | Ehsan Imani | Taher Jafferjee | Michael Bowling
[1] Martha White, et al. Organizing Experience: a Deeper Look at Replay Mechanisms for Sample-Based Planning in Continuous State Domains, 2018, IJCAI.
[2] Erik Talvitie, et al. Self-Correcting Models for Model-Based Reinforcement Learning, 2016, AAAI.
[3] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[4] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[5] Andrew W. Moore, et al. Prioritized sweeping: Reinforcement learning with less data and less time, 2004, Machine Learning.
[6] Honglak Lee, et al. Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion, 2018, NeurIPS.
[7] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[8] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[9] Jakub W. Pachocki, et al. Learning dexterous in-hand manipulation, 2018, Int. J. Robotics Res.
[10] Shalabh Bhatnagar, et al. Multi-Step Dyna Planning for Policy Evaluation and Control, 2009, NIPS.
[11] Gabriel Kalweit, et al. Uncertainty-driven Imagination for Continuous Deep Reinforcement Learning, 2017, CoRL.
[12] Sergey Levine, et al. Continuous Deep Q-Learning with Model-based Acceleration, 2016, ICML.
[13] Erik Talvitie, et al. The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces, 2018, ArXiv.
[14] Erik Talvitie, et al. Model Regularization for Stable Sample Rollouts, 2014, UAI.
[15] Sergey Levine, et al. Recall Traces: Backtracking Models for Efficient Reinforcement Learning, 2018, ICLR.
[16] Alborz Geramifard, et al. Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping, 2008, UAI.
[17] Richard S. Sutton, et al. Dyna, an integrated architecture for learning, planning, and reacting, 1990, SGAR.
[18] Matteo Hessel, et al. When to use parametric models in reinforcement learning?, 2019, NeurIPS.
[19] Jing Peng, et al. Efficient Learning and Planning Within the Dyna Framework, 1993, Adapt. Behav.