Weakly Supervised Recognition of Surgical Gestures

Kinematic trajectories recorded from surgical robots contain information about surgical gestures and potentially encode cues about surgeons' skill levels. Automatic segmentation of these trajectories into meaningful action units could help to develop new metrics for surgical skill assessment and to simplify surgical automation. State-of-the-art methods for action recognition rely on manual labelling of large datasets, which is time-consuming and error-prone. Unsupervised methods have been developed to overcome these limitations, but they often require tedious parameter tuning and perform worse than supervised approaches, especially on data with high variability such as surgical trajectories. Weak supervision therefore has the potential to improve unsupervised learning while avoiding manual annotation of large datasets. In this paper, we used as little as one expert demonstration, together with its ground-truth annotations, to generate an appropriate initialization for a GMM-based gesture recognition algorithm. We showed on real surgical demonstrations that this initialization significantly outperforms standard task-agnostic initialization methods. We also demonstrated how to improve recognition accuracy further by redefining the actions and optimising the inputs.
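The abstract does not spell out the implementation details, but the core idea of seeding EM from a single annotated demonstration can be sketched as follows. This is an illustrative approximation, not the paper's exact method: per-gesture means, covariances, and weights are estimated from the labelled trajectory and passed as the initial parameters of a scikit-learn `GaussianMixture`, which EM then refines on unlabelled data. The function name and feature layout are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def init_gmm_from_demo(X, labels, n_components):
    """Seed a GMM with statistics from one annotated demonstration.

    X: (n_samples, n_features) kinematic features of the expert demo.
    labels: (n_samples,) integer gesture label per sample.
    Returns an unfitted GaussianMixture whose EM starts from the
    per-gesture means, covariances, and weights of the demo.
    """
    means, weights, precisions = [], [], []
    for k in range(n_components):
        Xk = X[labels == k]
        means.append(Xk.mean(axis=0))
        weights.append(len(Xk) / len(X))
        # Regularize the covariance so it is invertible even for short segments.
        cov = np.cov(Xk, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        precisions.append(np.linalg.inv(cov))
    return GaussianMixture(
        n_components=n_components,
        covariance_type="full",
        weights_init=np.array(weights),
        means_init=np.array(means),
        precisions_init=np.array(precisions),
    )
```

The returned model would then be fitted on the unlabelled trajectories (`gmm.fit(X_unlabelled)`) and its per-sample `predict` output read as a gesture segmentation, to be compared against task-agnostic initializations such as k-means++ or random restarts.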
