An objective comparison of detection and segmentation algorithms for artefacts in clinical endoscopy

We present a comprehensive analysis of the submissions to the first edition of the Endoscopy Artefact Detection (EAD) challenge. Using crowd-sourcing, this initiative is a step towards understanding the limitations of existing state-of-the-art computer vision methods applied to endoscopy and promoting the development of new approaches suitable for clinical translation. Endoscopy is a routine imaging technique for the detection, diagnosis and treatment of diseases in hollow organs: the esophagus, stomach, colon, uterus and bladder. However, the nature of these organs prevents imaged tissues from being free of imaging artefacts such as bubbles, pixel saturation, organ specularity and debris, all of which pose substantial challenges for any quantitative analysis. Consequently, the potential for improved clinical outcomes through quantitative assessment of abnormal mucosal surfaces observed in endoscopy videos is presently not realized accurately. The EAD challenge promotes awareness of this key bottleneck and addresses it by investigating methods that can accurately classify, localize and segment artefacts in endoscopy frames as critical prerequisite tasks. Using a diverse, curated, multi-institutional, multi-modality, multi-organ dataset of video frames, the accuracy and performance of 23 algorithms were objectively ranked for artefact detection and segmentation. The ability of methods to generalize to unseen datasets was also evaluated. The best performing methods (top 15%) propose deep learning strategies to reconcile variabilities in artefact appearance with respect to size, modality, occurrence and organ type. However, no single method outperformed the others across all tasks. Detailed analyses reveal the shortcomings of current training strategies and highlight the need for developing new optimal metrics to accurately quantify the clinical applicability of methods.
