Deep Reinforcement Learning for Tactile Robotics: Learning to Type on a Braille Keyboard

Artificial touch would seem well-suited for Reinforcement Learning (RL), since both paradigms rely on interaction with an environment. Here we propose a new environment and set of tasks to encourage development of tactile reinforcement learning: learning to type on a braille keyboard. Four tasks are proposed, progressing in difficulty from arrow to alphabet keys and from discrete to continuous actions. A simulated counterpart is also constructed by sampling tactile data from the physical environment. Using state-of-the-art deep RL algorithms, we show that all of these tasks can be successfully learnt in simulation, and 3 out of 4 tasks can be learned on the real robot. A lack of sample efficiency currently makes the continuous alphabet task impractical on the robot. To the best of our knowledge, this work presents the first demonstration of successfully training deep RL agents in the real world using observations that exclusively consist of tactile images. To aid future research utilising this environment, the code for this project has been released along with designs of the braille keycaps for 3D printing and a guide for recreating the experiments.

[1]  Marcin Andrychowicz,et al.  Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research , 2018, ArXiv.

[2]  Martin V. Butz,et al.  Self-supervised regrasping using spatio-temporal tactile features and reinforcement learning , 2016, IROS 2016.

[3]  Marc G. Bellemare,et al.  The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..

[4]  Marcin Andrychowicz,et al.  Solving Rubik's Cube with a Robot Hand , 2019, ArXiv.

[5]  Sandy H. Huang,et al.  Learning Gentle Object Manipulation with Curiosity-Driven Deep Reinforcement Learning , 2019, ArXiv.

[6]  Tom Schaul,et al.  Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.

[7]  Qiaozhi Wang,et al.  OffWorld Gym: open-access physical robotics environment for real-world reinforcement learning benchmark and research , 2019, ArXiv.

[8]  James Bergstra,et al.  Benchmarking Reinforcement Learning Algorithms on Real-World Robots , 2018, CoRL.

[9]  Jitendra Malik,et al.  More Than a Feeling: Learning to Grasp and Regrasp Using Vision and Touch , 2018, IEEE Robotics and Automation Letters.

[10]  Sergey Levine,et al.  Trust Region Policy Optimization , 2015, ICML.

[11]  Andrew Owens,et al.  Shape-independent hardness estimation using deep learning and a GelSight tactile sensor , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[12]  Maria Bauzá,et al.  Tactile Regrasp: Grasp Adjustments via Simulated Tactile Transformations , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[13]  Sergey Levine,et al.  Sim-To-Real via Sim-To-Sim: Data-Efficient Robotic Grasping via Randomized-To-Canonical Adaptation Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[15]  Jakub W. Pachocki,et al.  Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res..

[16]  Yuval Tassa,et al.  Maximum a Posteriori Policy Optimisation , 2018, ICLR.

[17]  Wojciech Zaremba,et al.  OpenAI Gym , 2016, ArXiv.

[18]  Raia Hadsell,et al.  From Pixels to Percepts: Highly Robust Edge Perception and Contour Following Using Deep Learning and an Optical Biomimetic Tactile Sensor , 2018, IEEE Robotics and Automation Letters.

[19]  Pieter Abbeel,et al.  Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.

[20]  Herke van Hoof,et al.  Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.

[21]  Marcin Andrychowicz,et al.  Hindsight Experience Replay , 2017, NIPS.

[22]  Tom Schaul,et al.  Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.

[23]  Gaurav S. Sukhatme,et al.  Self-supervised regrasping using spatio-temporal tactile features and reinforcement learning , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[24]  Henry Zhu,et al.  Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.

[25]  Sergey Levine,et al.  REPLAB: A Reproducible Low-Cost Arm Benchmark for Robotic Learning , 2019, 2019 International Conference on Robotics and Automation (ICRA).

[26]  Alec Radford,et al.  Proximal Policy Optimization Algorithms , 2017, ArXiv.

[27]  Alex Graves,et al.  Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.

[28]  Jan Peters,et al.  Stable reinforcement learning with autoencoders for tactile and visual data , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[29]  Yuval Tassa,et al.  Data-efficient Deep Reinforcement Learning for Dexterous Manipulation , 2017, ArXiv.

[30]  James Bergstra,et al.  Setting up a Reinforcement Learning Task with a Real-World Robot , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[31]  Sergey Levine,et al.  REPLAB: A Reproducible Low-Cost Arm Benchmark Platform for Robotic Learning , 2019, ArXiv.

[32]  Nathan F. Lepora,et al.  Optimal Deep Learning for Robot Touch: Training Accurate Pose Models of 3D Surfaces and Edges , 2020, IEEE Robotics & Automation Magazine.

[33]  Chen Lu,et al.  Surface Following using Deep Reinforcement Learning and a GelSightTactile Sensor , 2019, ArXiv.

[34]  Bohan Wu,et al.  MAT: Multi-Fingered Adaptive Tactile Grasping via Deep Reinforcement Learning , 2019, CoRL.

[35]  David Silver,et al.  Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.

[36]  Henry Zhu,et al.  ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots , 2019, CoRL.

[37]  Jonathan Rossiter,et al.  The TacTip Family: Soft Optical Tactile Sensors with 3D-Printed Biomimetic Morphologies , 2018, Soft robotics.

[38]  Sergey Levine,et al.  Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.

[39]  Sergey Levine,et al.  Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations , 2017, Robotics: Science and Systems.

[40]  Yuval Tassa,et al.  Continuous control with deep reinforcement learning , 2015, ICLR.

[41]  Andrew Owens,et al.  The Feeling of Success: Does Touch Sensing Help Predict Grasp Outcomes? , 2017, CoRL.

[42]  Tom Schaul,et al.  Prioritized Experience Replay , 2015, ICLR.

[43]  Yuval Tassa,et al.  DeepMind Control Suite , 2018, ArXiv.