Can surgical simulation be used to train detection and classification of neural networks?

Computer-assisted interventions (CAI) aim to increase the effectiveness, precision and repeatability of procedures to improve surgical outcomes. The presence and motion of surgical tools are key inputs for CAI surgical phase recognition algorithms. Vision-based tool detection and recognition approaches are an attractive solution and can be designed to take advantage of the powerful deep learning paradigm that is rapidly advancing image recognition and classification. The challenge for such algorithms is the availability and quality of labelled training data. In this Letter, surgical simulation is used to train tool detection and segmentation models based on deep convolutional neural networks and generative adversarial networks. The authors experiment with two network architectures for image segmentation of tool classes commonly encountered during cataract surgery. A commercially available simulator is used to create a simulated cataract dataset for training models prior to performing transfer learning on real surgical data. To the best of the authors' knowledge, this is the first attempt to train deep learning models for surgical instrument detection on simulated data while demonstrating promising generalisation to real data. The results indicate that simulated data has some potential for training advanced classification methods for CAI systems.
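The two-stage strategy the abstract describes (pre-train on abundant simulated labels, then fine-tune on scarce real labels) can be sketched with a toy per-pixel classifier. This is a minimal illustrative sketch only: the logistic model, three-feature pixels, synthetic data generator and domain-shift parameter below are assumptions for demonstration, not the authors' networks or dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, w=None, lr=0.5, epochs=200):
    """Per-pixel logistic 'segmenter' trained by gradient descent."""
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)  # logistic-loss gradient step
    return w

def accuracy(w, X, y):
    return np.mean((sigmoid(X @ w) > 0.5) == y)

def make_data(n, shift):
    """Synthetic pixels: label 1 = 'tool', features brighter for tool
    pixels; `shift` models the simulated-to-real domain gap."""
    y = rng.integers(0, 2, n)
    X = rng.normal(0.0, 1.0, (n, 3)) + y[:, None] * (1.0 + shift)
    return np.hstack([X, np.ones((n, 1))]), y  # append bias feature

X_sim, y_sim = make_data(5000, shift=0.0)          # cheap simulated labels
X_real_few, y_real_few = make_data(50, shift=0.4)  # scarce real labels
X_test, y_test = make_data(2000, shift=0.4)        # real held-out data

# Stage 1: pre-train on simulation only.
w_sim = train(X_sim, y_sim)

# Stage 2: transfer learning — fine-tune the pre-trained weights on the
# small labelled real set with a lower learning rate and fewer steps.
w_ft = train(X_real_few, y_real_few, w=w_sim.copy(), lr=0.1, epochs=50)

print(f"sim-only accuracy on real test:   {accuracy(w_sim, X_test, y_test):.3f}")
print(f"fine-tuned accuracy on real test: {accuracy(w_ft, X_test, y_test):.3f}")
```

The design point mirrors the Letter's claim: a model trained purely on simulated data already transfers usefully, and a small amount of real labelled data for fine-tuning closes part of the remaining domain gap.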
