Unsupervised surgical data alignment with application to automatic activity annotation

Robotic surgery and other minimally invasive surgical techniques are an integral part of patient care and readily yield large amounts of data. Surgical tool motion (kinematic data) contains information that is useful for assessment and education. Typically, assessment and education tools that rely on kinematic data require substantial manual processing, such as activity annotation. The goal of this paper was to develop an automated method to align surgical recordings and assign activity annotations. We developed an approach based on unsupervised alignment to efficiently annotate kinematic data with its constituent activity segments. Our method extracts non-linear features from the kinematic data using a stacked de-noising autoencoder, and uses modified dynamic time warping to align the kinematic data from different trials of the study task. We combined alignment between a test trial and one or a small set of template trials (with prior manual annotations) with voting based on kernel density estimation to transfer labels from the template trials to the test trial. Our experiments evaluating this method on two datasets captured in the training laboratory demonstrate an accuracy of 72% to 94% for annotating activity segments within a surgical training task. Our findings are robust to data captured from several surgeons, and to deviations in activity from a canonical activity sequence.
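The alignment-and-transfer idea can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not the method described above: it uses classic dynamic time warping rather than the modified variant, raw 1-D values rather than autoencoder features, a single annotated template, and a plain majority vote in place of voting based on kernel density estimation. The function names `dtw_path` and `transfer_labels` are hypothetical.

```python
from collections import Counter


def dtw_path(a, b):
    """Classic DTW between two 1-D sequences.

    Returns the warping path as a list of (i, j) index pairs,
    where i indexes `a` and j indexes `b`.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Backtrack from the end of both sequences to recover the path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return path


def transfer_labels(template, template_labels, test):
    """Copy per-sample labels from an annotated template to a test trial.

    Each test sample collects the labels of every template sample it is
    aligned with, then takes the most frequent one (simple majority vote,
    an assumption standing in for the density-based voting).
    """
    votes = [Counter() for _ in test]
    for i, j in dtw_path(template, test):
        votes[j][template_labels[i]] += 1
    return [v.most_common(1)[0][0] for v in votes]


# Toy data: the test trial performs the same activities at a different pace.
template = [0, 0, 1, 1, 2, 2]
template_labels = ["A", "A", "B", "B", "C", "C"]
test = [0, 0, 0, 1, 2, 2, 2]
predicted = transfer_labels(template, template_labels, test)
```

With several annotated templates, one would collect votes from each template's alignment before deciding on a label; how those votes are weighted is where a density-based scheme would replace the simple count used here.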
