Online adaptation of uncertain models using neural network priors and partially observable planning
Christian Goerick | Akinobu Hayashi | Dirk Ruiken | Tadaaki Hasegawa
[1] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[2] Nicholas Roy,et al. PUMA: Planning Under Uncertainty with Macro-Actions , 2010, AAAI.
[3] Joel Veness,et al. Monte-Carlo Planning in Large POMDPs , 2010, NIPS.
[4] Sergey Levine,et al. One-shot learning of manipulation skills with online dynamics adaptation and neural network priors , 2015, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[5] Gaurav S. Sukhatme,et al. Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning , 2017, ICML.
[6] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[7] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[8] Rouhollah Rahmatizadeh,et al. From Virtual Demonstration to Real-World Manipulation Using LSTM and MDN , 2016, AAAI.
[9] Jun Nakanishi,et al. Movement imitation with nonlinear dynamical systems in humanoid robots , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292).
[10] Wolfram Burgard,et al. A Probabilistic Framework for Learning Kinematic Models of Articulated Objects , 2011, J. Artif. Intell. Res.
[11] David Wingate,et al. A Physics-Based Model Prior for Object-Oriented MDPs , 2014, ICML.
[12] C. Bishop. Mixture density networks , 1994.
[13] Alex Graves,et al. Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.
[14] Emanuel Todorov,et al. Physically consistent state estimation and system identification for contacts , 2015, 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids).
[15] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[16] Sergey Levine,et al. Deep visual foresight for planning robot motion , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[17] Pieter Abbeel,et al. Deep learning helicopter dynamics models , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).
[18] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[19] David Hsu,et al. SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces , 2008, Robotics: Science and Systems.
[20] Sergey Levine,et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics , 2014, NIPS.
[21] Leslie Pack Kaelbling,et al. Belief space planning assuming maximum likelihood observations , 2010, Robotics: Science and Systems.
[22] Greg Turk,et al. Preparing for the Unknown: Learning a Universal Policy with Online System Identification , 2017, Robotics: Science and Systems.