Improved Learning of Robot Manipulation Tasks Via Tactile Intrinsic Motivation

In this letter, we address the challenge of exploration in deep reinforcement learning for robotic manipulation tasks. In sparse-goal settings, an agent receives no positive feedback until it reaches the goal by chance, which becomes infeasible for longer control sequences. Inspired by touch-based exploration observed in children, we formulate an intrinsic reward, based on the sum of forces between a robot's force sensors and manipulation objects, that encourages physical interaction. Furthermore, we introduce contact-prioritized experience replay, a sampling scheme that prioritizes contact-rich episodes and transitions. We show that our solution accelerates exploration and outperforms state-of-the-art methods on three fundamental robot manipulation benchmarks.
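
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `scale` and `cap` parameters, the small `eps` offset, and the choice to weight episodes by their summed contact force are all assumptions made for illustration, since the abstract does not give the exact formulas.

```python
import random


def tactile_intrinsic_reward(sensor_forces, scale=0.1, cap=1.0):
    """Hypothetical intrinsic reward from one time step of force readings.

    Sums the force magnitudes over all sensors, then scales and caps the
    result. The scaling and capping are assumptions, not from the source.
    """
    total_force = sum(abs(f) for f in sensor_forces)
    return min(scale * total_force, cap)


def episode_priorities(episodes, eps=1e-3):
    """Sampling probabilities that favor contact-rich episodes.

    Each episode is a list of time steps; each time step is a list of
    force readings. The weight of an episode is its summed force plus a
    small eps so that contact-free episodes can still be sampled.
    """
    weights = [
        eps + sum(sum(abs(f) for f in step) for step in episode)
        for episode in episodes
    ]
    norm = sum(weights)
    return [w / norm for w in weights]


def sample_episode_index(episodes, rng):
    """Draw one episode index according to the contact-based priorities."""
    probs = episode_priorities(episodes)
    return rng.choices(range(len(episodes)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Two toy episodes: the first has no contact, the second has contact.
    toy_episodes = [
        [[0.0, 0.0], [0.0, 0.0]],
        [[1.0, 0.5], [2.0, 0.0]],
    ]
    print(tactile_intrinsic_reward([1.0, 0.5]))
    print(episode_priorities(toy_episodes))
    print(sample_episode_index(toy_episodes, random.Random(0)))
```

In this sketch the intrinsic reward would be added to the sparse task reward at each step, and the priorities would replace uniform sampling when drawing episodes from the replay buffer. How the paper combines the two rewards, and how it prioritizes individual transitions within an episode, is not stated in the abstract.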
