Automatic Gesture Recognition in Robot-assisted Surgery with Reinforcement Learning and Tree Search

Automatic surgical gesture recognition is fundamental to improving intelligence in robot-assisted surgery, enabling complex tasks such as surgical monitoring and skill evaluation. However, current methods treat each frame individually and produce their outputs without effectively considering future information. In this paper, we propose a framework based on reinforcement learning and tree search for joint surgical gesture segmentation and classification. An agent is trained to segment and classify the surgical video in a human-like manner, and its direct decisions are then reconsidered by tree search. Our proposed tree search algorithm combines the outputs of two purpose-designed neural networks, a policy network and a value network. By integrating complementary information from these distinct models, our framework achieves better performance than baseline methods that use either network alone. In the overall evaluation, our approach consistently outperforms existing methods on the suturing task of the JIGSAWS dataset in terms of accuracy, edit score and F1 score. Our study highlights the use of tree search to refine actions in a reinforcement learning framework for surgical robotic applications.
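To make the idea of refining an agent's decisions with tree search concrete, the following is a minimal sketch, not the authors' implementation. It assumes a PUCT-style selection rule of the kind used in AlphaGo-style systems, and it replaces the two neural networks with hand-written stand-ins: `toy_policy` and `toy_value` are hypothetical functions that read per-frame class scores. The action format (a gesture label plus a step size), the constants and all function names are illustrative assumptions.

```python
import math
from dataclasses import dataclass, field

GESTURES = (0, 1, 2)   # hypothetical gesture classes
STEPS = (1, 2)         # hypothetical step sizes: frames labelled per action
ACTIONS = [(g, k) for g in GESTURES for k in STEPS]


@dataclass
class Node:
    prior: float                 # probability assigned by the policy stand-in
    visits: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)

    def q(self):
        # Mean of the value estimates backed up through this node.
        return self.value_sum / self.visits if self.visits else 0.0


def apply_action(labels, action, n_frames):
    # Label the next k frames with gesture g, clipped at the end of the video.
    g, k = action
    k = min(k, n_frames - len(labels))
    return labels + (g,) * k


def toy_policy(labels, frames):
    # Stand-in for a policy network: softmax over the mean class score
    # of the frames each action would cover. Requires len(labels) < len(frames).
    t = len(labels)
    logits = {}
    for g, k in ACTIONS:
        window = frames[t:t + k]
        logits[(g, k)] = sum(f[g] for f in window) / len(window)
    z = sum(math.exp(v) for v in logits.values())
    return {a: math.exp(v) / z for a, v in logits.items()}


def toy_value(labels, frames):
    # Stand-in for a value network: mean score of the labels chosen so far.
    if not labels:
        return 0.0
    return sum(frames[i][g] for i, g in enumerate(labels)) / len(labels)


def search(root_state, frames, policy_fn, value_fn, n_sim=50, c_puct=1.0):
    # PUCT-style tree search: priors guide exploration, values are backed up.
    n = len(frames)
    root = Node(prior=1.0)
    for _ in range(n_sim):
        node, state, path = root, root_state, [root]
        # Selection: descend while the current node has been expanded.
        while node.children:
            sqrt_n = math.sqrt(node.visits)
            action, node = max(
                node.children.items(),
                key=lambda kv: kv[1].q()
                + c_puct * kv[1].prior * sqrt_n / (1 + kv[1].visits),
            )
            state = apply_action(state, action, n)
            path.append(node)
        # Expansion: add children for non-terminal leaves using policy priors.
        if len(state) < n:
            for a, p in policy_fn(state, frames).items():
                node.children[a] = Node(prior=p)
        # Evaluation and backup along the visited path.
        v = value_fn(state, frames)
        for visited in path:
            visited.visits += 1
            visited.value_sum += v
    # Act with the most-visited root action.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]


def decode(frames, n_sim=50):
    # Label the whole sequence by running one search per decision.
    labels = ()
    while len(labels) < len(frames):
        action = search(labels, frames, toy_policy, toy_value, n_sim=n_sim)
        labels = apply_action(labels, action, len(frames))
    return labels
```

Calling `decode` on a list of per-frame score vectors (one score per gesture class) returns a tuple with one gesture label per frame. In this sketch the policy stand-in only biases which branches are explored, while the value stand-in determines the statistics that are backed up, which is one simple way two complementary models can be combined in a single search.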
