EndoSLAM dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos

Deep learning techniques hold promise for dense topography reconstruction and pose estimation in endoscopic videos. However, currently available datasets do not support effective quantitative benchmarking. In this paper, we introduce a comprehensive endoscopic SLAM dataset consisting of 3D point cloud data for six porcine organs, capsule and standard endoscopy recordings, and synthetically generated data. A Panda robotic arm, two commercially available capsule endoscopes, two conventional endoscopes with different camera properties, and two high-precision 3D scanners were employed to collect data from eight ex-vivo porcine gastrointestinal (GI) tract organs. In total, 35 sub-datasets are provided with 6D pose ground truth for the ex-vivo part: 18 sub-datasets for colon, 12 for stomach, and 5 for small intestine. Four of these contain polyp-mimicking elevations created by an expert gastroenterologist. Synthetic capsule endoscopy frames of the GI tract with both depth and pose annotations are included to facilitate the study of simulation-to-real transfer learning algorithms. Additionally, we propose Endo-SfMLearner, an unsupervised monocular depth and pose estimation method that combines residual networks with a spatial attention module in order to direct the network to focus on distinguishable and highly textured tissue regions. The proposed approach uses a brightness-aware photometric loss to improve robustness under fast frame-to-frame illumination changes. To exemplify the use of the EndoSLAM dataset, the performance of Endo-SfMLearner is extensively compared with the state of the art. The code and a link to the dataset are publicly available at this https URL. A video demonstrating the experimental setup and procedure is accessible through this https URL.
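
The abstract does not spell out the form of the brightness-aware photometric loss, so the following is only a minimal sketch of the general idea, not the Endo-SfMLearner implementation. It assumes a global affine brightness alignment (matching the mean and standard deviation of the warped frame to those of the target frame) followed by an L1 error. The function names `brightness_aligned_l1` and `plain_l1`, the `eps` parameter, and the synthetic frames are all hypothetical and chosen for illustration.

```python
import numpy as np

def plain_l1(target, warped):
    """Plain L1 photometric error between two frames (no brightness handling)."""
    target = np.asarray(target, dtype=np.float64)
    warped = np.asarray(warped, dtype=np.float64)
    return float(np.abs(target - warped).mean())

def brightness_aligned_l1(target, warped, eps=1e-6):
    """L1 error after an assumed global affine brightness alignment.

    The warped frame is rescaled and shifted so that its mean and standard
    deviation match those of the target frame before the error is computed.
    """
    target = np.asarray(target, dtype=np.float64)
    warped = np.asarray(warped, dtype=np.float64)
    scale = target.std() / (warped.std() + eps)  # contrast correction
    aligned = (warped - warped.mean()) * scale + target.mean()  # brightness shift
    return float(np.abs(target - aligned).mean())

# Synthetic example: the same frame under a global gain and offset change.
rng = np.random.default_rng(0)
frame = rng.uniform(0.2, 0.8, size=(32, 32))
brighter = 1.3 * frame + 0.1

loss_plain = plain_l1(frame, brighter)
loss_aligned = brightness_aligned_l1(frame, brighter)
print(f"plain L1: {loss_plain:.4f}, brightness-aligned L1: {loss_aligned:.6f}")
```

In this toy case the two frames differ only by a global illumination change, so the aligned error is close to zero while the plain L1 error stays large. A real endoscopic sequence would also involve view synthesis from predicted depth and pose, which this sketch leaves out.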
