CaDIS: Cataract dataset for surgical RGB-image segmentation

Video feedback provides a wealth of information about surgical procedures and is the main sensory cue for surgeons. Scene understanding is crucial to computer-assisted interventions (CAI) and to post-operative analysis of the surgical procedure. A fundamental building block of such capabilities is the identification and localization of surgical instruments and anatomical structures through semantic segmentation. Deep learning has advanced semantic segmentation techniques in recent years but remains inherently reliant on labelled datasets for model training. This paper introduces a dataset for semantic segmentation of cataract surgery videos, complementing the publicly available CATARACTS challenge dataset. In addition, we benchmark the performance of several state-of-the-art deep learning models for semantic segmentation on the presented dataset. The dataset is publicly available at https://cataracts-semantic-segmentation2020.grand-challenge.org/.
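
The kind of benchmarking described above can be reproduced in spirit with off-the-shelf segmentation models. The sketch below fine-tunes a torchvision DeepLabv3 model on image/mask pairs; it is not the authors' benchmark code, and the directory layout (cadis/train/images, cadis/train/labels), the class count NUM_CLASSES, and the hyperparameters are illustrative assumptions rather than the paper's configuration.

```python
# Minimal training sketch (assumes torchvision >= 0.13 and PyTorch).
# Dataset layout, class count, and hyperparameters below are assumptions.
from pathlib import Path

import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50
from PIL import Image

NUM_CLASSES = 36  # assumption: set this to the class granularity you actually use


class CadisPairs(Dataset):
    """Assumed layout: <root>/images/*.png with matching masks in <root>/labels/*.png."""

    def __init__(self, root):
        self.images = sorted(Path(root, "images").glob("*.png"))
        self.to_tensor = transforms.ToTensor()

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img_path = self.images[idx]
        mask_path = Path(str(img_path).replace("images", "labels"))
        image = self.to_tensor(Image.open(img_path).convert("RGB"))
        # Masks are assumed to store per-pixel class indices as a single-channel PNG.
        mask = torch.as_tensor(np.array(Image.open(mask_path)), dtype=torch.long)
        return image, mask


def train_one_epoch(model, loader, optimizer, device):
    criterion = nn.CrossEntropyLoss()
    model.train()
    for images, masks in loader:
        images, masks = images.to(device), masks.to(device)
        logits = model(images)["out"]   # [B, NUM_CLASSES, H, W]
        loss = criterion(logits, masks)  # masks: [B, H, W] with class indices
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = deeplabv3_resnet50(weights=None, num_classes=NUM_CLASSES).to(device)
    # Frames are assumed to share one resolution so default batch collation works.
    loader = DataLoader(CadisPairs("cadis/train"), batch_size=4, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    train_one_epoch(model, loader, optimizer, device)
```

Any of the other architectures mentioned in the paper (e.g. HRNet or UPerNet) could be substituted for the DeepLabv3 model in this loop; only the model construction and the per-class evaluation would change.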
