Open-Sourced Reinforcement Learning Environments for Surgical Robotics

Reinforcement Learning (RL) is a machine learning framework in which artificially intelligent systems learn to solve a variety of complex problems. Recent years have seen a surge of successes in solving challenging games and smaller domain problems, including simple though non-specific robotic manipulation and grasping tasks. These rapid successes have come in part from the strong collaborative effort by the RL community to work on common, open-sourced environment simulators such as OpenAI's Gym, which allow for expedited development and valid comparisons between different state-of-the-art strategies. In this paper, we aim to bridge the RL and surgical robotics communities by presenting the first open-sourced reinforcement learning environments for surgical robotics, called dVRL. Through the proposed RL environments, which are functionally equivalent to Gym, we show that it is easy to prototype and implement state-of-the-art RL algorithms on surgical robotics problems that aim to introduce autonomous robotic precision and accuracy to assisting, collaborative, or repetitive tasks during surgery. Learned policies are furthermore successfully transferable to a real robot. Finally, by combining dVRL with the international network of more than 40 da Vinci Surgical Research Kits in active use at academic institutions, we see dVRL as enabling the broad surgical robotics community to fully leverage the newest strategies in reinforcement learning, and enabling reinforcement learning scientists with no knowledge of surgical robotics to test and develop new algorithms that can solve the real-world, high-impact challenges in autonomous surgery.
