Efficient Intrinsically Motivated Robotic Grasping with Learning-Adaptive Imagination in Latent Space

Combining model-based and model-free deep reinforcement learning has shown great promise for improving sample efficiency on complex control tasks while retaining high performance. Incorporating imagination is a recent effort in this direction, inspired by human mental simulation of motor behavior. We propose a learning-adaptive imagination approach that, unlike previous approaches, takes into account the reliability of the learned dynamics model used to imagine the future. Our approach learns an ensemble of disjoint local dynamics models in latent space and derives an intrinsic reward based on learning progress, motivating the controller to take actions that yield data which improves the models. The learned models are used to generate imagined experiences that augment the training set of real experiences. We evaluate our approach on vision-based robotic grasping and show that it significantly improves sample efficiency and achieves near-optimal performance in a sparse-reward environment.
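The abstract describes two mechanisms: an intrinsic reward derived from the learning progress of learned latent dynamics models, and imagination rollouts with those models that augment real experience. The sketch below is a minimal illustration of these two ideas, not the authors' implementation: the single linear forward model, the toy environment, the dimensions, and all names are illustrative assumptions, and the paper's ensemble of disjoint local models, latent encoder, and actor-critic learner are not reproduced.

```python
# Minimal sketch (assumed, not the paper's code) of learning-progress intrinsic
# reward and imagination-based replay augmentation over a latent state.
import numpy as np
from collections import deque


class LocalLatentModel:
    """One forward model in latent space: z' ~ W [z; a] + b (linear stand-in)."""

    def __init__(self, z_dim, a_dim, lr=1e-2):
        self.W = np.zeros((z_dim, z_dim + a_dim))
        self.b = np.zeros(z_dim)
        self.lr = lr
        self.errors = deque(maxlen=200)  # rolling window of prediction errors

    def predict(self, z, a):
        return self.W @ np.concatenate([z, a]) + self.b

    def update(self, z, a, z_next):
        # one SGD step on the squared one-step prediction error
        pred = self.predict(z, a)
        err = z_next - pred
        x = np.concatenate([z, a])
        self.W += self.lr * np.outer(err, x)
        self.b += self.lr * err
        self.errors.append(float(np.mean(err ** 2)))
        return self.learning_progress()

    def learning_progress(self):
        """Drop in mean prediction error between the older and newer half of the
        window; positive values mean the model is still improving on this region."""
        if len(self.errors) < 20:
            return 0.0
        half = len(self.errors) // 2
        old = np.mean(list(self.errors)[:half])
        new = np.mean(list(self.errors)[half:])
        return float(max(0.0, old - new))


def imagine_rollout(model, z0, policy, horizon=5):
    """Generate imagined latent transitions by unrolling the learned model."""
    transitions, z = [], z0
    for _ in range(horizon):
        a = policy(z)
        z_next = model.predict(z, a)
        transitions.append((z, a, z_next))
        z = z_next
    return transitions


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z_dim, a_dim = 4, 2
    model = LocalLatentModel(z_dim, a_dim)
    true_W = rng.normal(size=(z_dim, z_dim + a_dim)) * 0.1  # toy "environment"
    replay = []  # real + imagined experience for a model-free learner

    for step in range(500):
        z = rng.normal(size=z_dim)
        a = rng.normal(size=a_dim)
        z_next = true_W @ np.concatenate([z, a])
        intrinsic_r = model.update(z, a, z_next)   # learning-progress reward
        replay.append((z, a, z_next, intrinsic_r))
        # once the model has seen some data, augment replay with imagined rollouts
        if step > 100 and step % 50 == 0:
            imagined = imagine_rollout(model, z, lambda s: rng.normal(size=a_dim))
            replay.extend((zi, ai, zn, 0.0) for zi, ai, zn in imagined)

    print(f"replay size (real + imagined): {len(replay)}")
```

In the paper's setting, the intrinsic reward would be added to the sparse grasping reward so the controller is driven toward states where the dynamics models are still improving, and only sufficiently reliable models would be trusted for imagination.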
