ToolNet: Holistically-nested real-time segmentation of robotic surgical tools

Real-time tool segmentation from endoscopic video is an essential component of many computer-assisted robotic surgical systems and is of critical importance in robotic surgical data science. We propose two novel deep learning architectures for automatic segmentation of non-rigid surgical instruments. Both exploit learned multi-scale feature extraction while aiming to keep the segmentation accurate at every resolution, and both encode this multi-scale constraint in the network architecture itself. The first enforces it through cascaded aggregation of predictions; the second uses a holistically-nested architecture in which the loss at each scale contributes to the optimization. Because the methods target real-time semantic labeling, both have a reduced number of parameters. We propose using parametric rectified linear units (PReLU) for semantic labeling in these small architectures to increase the regularization of the network while maintaining segmentation accuracy. We compare the proposed architectures against state-of-the-art fully convolutional networks and validate them on existing benchmark datasets, including ex vivo cases with phantom tissue and different robotic surgical instruments in the scene. Our results show a statistically significant improvement in Dice Similarity Coefficient over previous instrument segmentation methods. We analyze our design choices and discuss the key drivers for improving accuracy.
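
The three ingredients named in the abstract (the parametric rectified linear unit, a loss accumulated over several scales, and the Dice Similarity Coefficient) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the equal default scale weights, the use of binary cross-entropy as the per-scale loss, and the assumption that every side output has already been resized to the label resolution are all illustrative choices that the abstract does not specify.

```python
import numpy as np


def prelu(x, slope):
    """Parametric ReLU: identity for positive inputs, `slope * x` otherwise.

    In a network `slope` is a learned parameter; here it is passed in.
    """
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, slope * x)


def dice_coefficient(pred_mask, true_mask):
    """Dice Similarity Coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred_mask).astype(bool)
    true = np.asarray(true_mask).astype(bool)
    total = pred.sum() + true.sum()
    if total == 0:
        # Both masks are empty: treat as perfect agreement (a convention).
        return 1.0
    return 2.0 * np.logical_and(pred, true).sum() / total


def binary_cross_entropy(prob, target, eps=1e-7):
    """Mean per-pixel binary cross-entropy (assumed stand-in per-scale loss)."""
    prob = np.clip(np.asarray(prob, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    return float(-np.mean(target * np.log(prob)
                          + (1.0 - target) * np.log(1.0 - prob)))


def multi_scale_loss(side_probs, target, weights=None):
    """Weighted sum of the per-scale losses of all side outputs.

    `side_probs` is a list of foreground-probability maps, one per scale,
    each assumed to be already resized to the shape of `target`.
    """
    if weights is None:
        weights = [1.0] * len(side_probs)  # hypothetical equal weighting
    return sum(w * binary_cross_entropy(p, target)
               for w, p in zip(weights, side_probs))


if __name__ == "__main__":
    label = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]])
    # Three made-up side outputs of increasing confidence.
    sides = [np.where(label == 1, c, 1.0 - c) for c in (0.6, 0.8, 0.95)]
    print("multi-scale loss:", multi_scale_loss(sides, label))
    print("Dice of final output:", dice_coefficient(sides[-1] > 0.5, label))
    print("PReLU:", prelu([-2.0, 3.0], slope=0.25))
```

Summing the per-scale terms means that a poor prediction at any single resolution raises the total, which is one simple way to read "the loss at each scale is taken into account"; the weighting and the per-scale loss used in the paper may differ.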
