Yuval Tassa | Alessandro Davide Ialongo | Martin A. Riedmiller | Abbas Abdolmaleki | Jost Tobias Springenberg | Nicolas Heess | Martin Riedmiller | Leonard Hasenclever | Josh Merel | Arunkumar Byravan | M. Berk Mirza | Piotr Trochim | Mehdi Mirza | N. Heess | J. Merel | A. Abdolmaleki | J. T. Springenberg | P. Trochim
[1] Byron Boots, et al. Blending MPC & Value Function Approximation for Efficient Reinforcement Learning, 2020, ArXiv.
[2] Pieter Abbeel, et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight, 2006, NIPS.
[3] Andrew Gordon Wilson, et al. On the model-based stochastic value gradient for continuous reinforcement learning, 2020, L4DC.
[4] Roozbeh Mottaghi, et al. Rearrangement: A Challenge for Embodied AI, 2020, ArXiv.
[5] Gabriel Dulac-Arnold, et al. Model-Based Offline Planning, 2020, ArXiv.