Towards Real-Time Detection of Squamous Pre-Cancers from Oesophageal Endoscopic Videos

This study investigates the feasibility of applying state-of-the-art deep learning techniques to detect precancerous stages of squamous cell carcinoma (SCC) in real time, addressing two challenges: the subtle appearance changes that make SCC difficult to diagnose, and video processing speed. Two deep learning models are implemented: one determines whether video frames contain artefacts, and the other detects, segments and classifies lesions in the artefact-free frames. For detection of SCC, both Mask R-CNN and YOLOv3 architectures are implemented. In addition, to ensure that a single bounding box is detected for each region of interest rather than multiple duplicated boxes, a faster non-maximum suppression (NMS) technique is applied on top of the predictions. As a result, the developed system can process videos at 16-20 frames per second. Three classes of SCC are distinguished: 'suspicious', 'high grade' and 'cancer'. For videos with a resolution of 1920x1080 pixels, the average processing time when applying YOLOv3 is in the range of 0.064-0.101 seconds per frame, i.e. 10-15 frames per second, when running under the Windows 10 operating system with one GPU (GeForce GTX 1060). The average accuracies for classification and detection are 85% and 74% respectively. Since YOLOv3 only provides bounding boxes, Mask R-CNN is also evaluated in order to delineate lesioned regions. While it achieves a better detection accuracy of 77%, its classification accuracy of 84% is similar to that of YOLOv3. However, its processing speed is more than 10 times slower, averaging 1.2 seconds per frame, owing to the creation of masks. The segmentation accuracy of Mask R-CNN is 63%. These results are based on datasets of 350 images. Further improvement is therefore needed in future, by collecting, annotating or augmenting more data.
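To illustrate the duplicate-box removal step mentioned in the abstract, the following is a minimal sketch of standard greedy non-maximum suppression in plain Python. It is not the faster variant used in the study, whose details are not given here. The function names `iou` and `nms`, the corner-coordinate box format and the IoU threshold of 0.5 are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept, in order of decreasing score.

    A box is kept only if it does not overlap any already-kept,
    higher-scoring box by more than iou_threshold (an assumed value).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Two near-duplicate detections of one region and one separate region.
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))  # prints [0, 2]
```

In a multi-class setting such as the three SCC classes described above, a routine like this would typically be run separately for each class, so that overlapping boxes with different labels do not suppress one another.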
