Multi-Task Recurrent Neural Network for Surgical Gesture Recognition and Progress Prediction

Surgical gesture recognition is important for surgical data science and computer-aided intervention. Even with robotic kinematic information, automatically segmenting surgical steps presents numerous challenges because surgical demonstrations are characterized by high variability in style, duration, and order of actions. To extract discriminative features from the kinematic signals and boost recognition accuracy, we propose a multi-task recurrent neural network that simultaneously recognizes surgical gestures and estimates a novel formulation of surgical task progress. To show the effectiveness of the presented approach, we evaluate it on the JIGSAWS dataset, which is currently the only publicly available dataset for surgical gesture recognition featuring robot kinematic data. We demonstrate that recognition performance improves in multi-task frameworks with progress estimation, without any additional manual labelling or training.
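
The multi-task idea can be illustrated with a minimal sketch, which is not the architecture evaluated in the paper. It assumes a plain tanh recurrent layer shared by two heads (a softmax gesture classifier and a sigmoid progress regressor), a weighted sum of cross-entropy and mean-squared error as the joint loss, and a progress target defined as the fraction of the demonstration elapsed. That target is only one possible definition and may differ from the formulation proposed here; it is chosen because it can be computed from frame indices alone, with no manual annotation. All names (`progress_targets`, `init_params`, `forward`, `multitask_loss`), layer sizes, and the loss weight are hypothetical.

```python
import numpy as np


def progress_targets(num_frames):
    """Assumed progress label: fraction of the demonstration elapsed, in [0, 1]."""
    if num_frames < 2:
        return np.zeros(num_frames)
    return np.arange(num_frames) / (num_frames - 1)


def init_params(n_in, n_hidden, n_gestures, seed=0):
    """Randomly initialise a shared recurrent layer and two task heads."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "W_xh": rng.normal(0, scale, (n_in, n_hidden)),      # input -> hidden
        "W_hh": rng.normal(0, scale, (n_hidden, n_hidden)),  # hidden -> hidden
        "b_h": np.zeros(n_hidden),
        "W_g": rng.normal(0, scale, (n_hidden, n_gestures)), # gesture head
        "b_g": np.zeros(n_gestures),
        "w_p": rng.normal(0, scale, n_hidden),               # progress head
        "b_p": 0.0,
    }


def forward(params, x):
    """Run the shared RNN over kinematic frames x with shape (T, n_in).

    Returns per-frame gesture logits (T, n_gestures) and progress estimates (T,).
    """
    num_frames = x.shape[0]
    h = np.zeros(params["b_h"].shape[0])
    gesture_logits = np.empty((num_frames, params["b_g"].shape[0]))
    progress = np.empty(num_frames)
    for t in range(num_frames):
        # Shared hidden state feeds both task heads.
        h = np.tanh(x[t] @ params["W_xh"] + h @ params["W_hh"] + params["b_h"])
        gesture_logits[t] = h @ params["W_g"] + params["b_g"]
        progress[t] = 1.0 / (1.0 + np.exp(-(h @ params["w_p"] + params["b_p"])))
    return gesture_logits, progress


def multitask_loss(gesture_logits, progress, gesture_labels, weight=1.0):
    """Cross-entropy on gestures plus weighted MSE on the assumed progress target."""
    z = gesture_logits - gesture_logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(gesture_labels)), gesture_labels].mean()
    mse = np.mean((progress - progress_targets(len(progress))) ** 2)
    return ce + weight * mse, ce, mse
```

In this sketch the progress term acts as an auxiliary signal on the shared hidden state: its gradient would flow into the same recurrent weights that the gesture head relies on, which is the usual mechanism by which an auxiliary task can shape the learned features. Training code (backpropagation through time, optimiser) is omitted.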
