Improving Temporal Stability and Accuracy for Endoscopic Video Tissue Classification Using Recurrent Neural Networks

Early Barrett’s neoplasia is often missed because its visual features are subtle and non-expert endoscopists have little experience with such lesions. While promising results have been reported on the automated detection of this type of early cancer in still endoscopic images, video-based detection that exploits the temporal domain remains an open problem. The temporally stable nature of video data in endoscopic examinations makes it possible to develop a framework that diagnoses the imaged tissue class over time, thereby yielding a more robust and improved model for spatial predictions. We show that introducing Recurrent Neural Network nodes yields a more stable and accurate model for tissue classification than classification on individual images. We have developed a customized ResNet18 feature extractor with four types of classifiers: Fully Connected (FC), Fully Connected with an averaging filter (FC Avg (n = 5)), Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). Experimental results are based on 82 pullback videos of the esophagus with 46 high-grade dysplasia patients. Our results demonstrate that the LSTM classifier outperforms the FC, FC Avg (n = 5) and GRU classifiers, with an average accuracy of 85.9% compared to 82.2%, 83.0% and 85.6%, respectively. The benefit of our novel implementation for endoscopic tissue classification is the inclusion of spatio-temporal information for improved and robust decision making, and it is a first step towards full temporal learning of esophageal cancer detection in endoscopic video.
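To illustrate the simplest temporal baseline named above, FC Avg (n = 5), the following minimal sketch smooths per-frame probabilities with a five-frame averaging filter. It is not the authors' implementation: the trailing (causal) window, the 0.5 decision threshold, the function names and the probability values are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import deque
from typing import Iterable, List


def trailing_average(probs: Iterable[float], n: int = 5) -> List[float]:
    """Average each frame's probability with up to n-1 preceding frames.

    Assumption: a causal window, so the first frames use a shorter window.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    window = deque(maxlen=n)
    running_sum = 0.0
    smoothed = []
    for p in probs:
        if len(window) == n:
            # The oldest value is about to be evicted by append().
            running_sum -= window[0]
        window.append(p)
        running_sum += p
        smoothed.append(running_sum / len(window))
    return smoothed


def count_label_flips(probs: Iterable[float], threshold: float = 0.5) -> int:
    """Count how often the thresholded label changes between adjacent frames."""
    labels = [p >= threshold for p in probs]
    return sum(a != b for a, b in zip(labels, labels[1:]))


# Hypothetical per-frame dysplasia probabilities from a frame-wise classifier,
# with two isolated outlier frames (made-up values, not data from the paper).
frame_probs = [0.9, 0.8, 0.2, 0.9, 0.85, 0.9, 0.1, 0.8, 0.9, 0.95]
smoothed_probs = trailing_average(frame_probs, n=5)

print("flips without smoothing:", count_label_flips(frame_probs))
print("flips with n = 5 average:", count_label_flips(smoothed_probs))
```

In this made-up sequence the averaged predictions no longer change label on the isolated outlier frames, which is the kind of temporal stability the abstract refers to. The LSTM and GRU classifiers replace this fixed filter with a learned recurrent state.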
