EndoMapper dataset of complete calibrated endoscopy procedures

Computer-assisted systems are becoming broadly used in medicine. In endoscopy, most research focuses on the automatic detection of polyps or other pathologies, while localization and navigation of the endoscope are still performed entirely manually by physicians. To broaden this research and bring spatial Artificial Intelligence to endoscopies, data from complete procedures are needed. These data will be used to build 3D mapping and localization systems that can perform special tasks such as detecting blind zones during exploration, providing automatic polyp measurements, guiding doctors to a polyp found in a previous exploration, and retrieving previous images of the same area and aligning them for easy comparison. Such systems will improve the quality and precision of the procedures while lowering the burden on physicians. This paper introduces the EndoMapper dataset, the first collection of complete endoscopy sequences acquired during regular medical practice, including slow and careful screening explorations, making secondary use of medical data. Its original purpose is to facilitate the development and evaluation of VSLAM (Visual Simultaneous Localization and Mapping) methods on real endoscopy data. The first release of the dataset is composed of 59 sequences with more than 15 hours of video. It is also the first endoscopic dataset that includes both the computed geometric and photometric endoscope calibration and the original calibration videos. The meta-data and annotations associated with the dataset range from anatomical landmark and procedure description labeling, tool segmentation masks, COLMAP 3D reconstructions, and simulated sequences with ground truth to meta-data related to special cases, such as sequences from the same patient. This information will improve research in endoscopic VSLAM and in other research lines, and will help create new ones.
