Point detection through multi-instance deep heatmap regression for sutures in endoscopy

Mitral valve repair is a complex minimally invasive surgery of the heart valve. In this context, suture detection from endoscopic images is a highly relevant task that provides quantitative information to analyse suturing patterns, assess prosthetic configurations and produce augmented reality visualisations. Facial or anatomical landmark detection tasks typically involve a fixed number of landmarks, and use regression or fixed heatmap-based approaches to localise them. In endoscopy, however, the number of sutures varies from image to image, and because the sutures are not semantically unique, they may occur at any location in the annulus. In this work, we formulate the suture detection task as a multi-instance deep heatmap regression problem, to identify entry and exit points of sutures. We extend our previous work, and introduce the novel use of a 2D Gaussian layer followed by a differentiable 2D spatial Soft-Argmax layer to function as a local non-maximum suppression. We present extensive experiments with multiple heatmap distribution functions and two variants of the proposed model. In the intra-operative domain, Variant 1 showed a mean $F_1$ improvement of +0.0422 over the baseline. Similarly, in the simulator domain, Variant 1 showed a mean $F_1$ improvement of +0.0865 over the baseline. The proposed model thus improves over the baseline in both the intra-operative and the simulator domains. The data is made publicly available within the scope of the MICCAI AdaptOR2021 Challenge (https://adaptor2021.github.io/), and the code at https://github.com/Cardio-AI/suture-detection-pytorch/.
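
The two ingredients named above, a multi-instance heatmap target and a spatial Soft-Argmax used for local peak refinement, can be illustrated with a minimal pure-Python sketch. It is not the implementation from the linked repository. The function names, the max-combination of Gaussians, and the `sigma`, `radius` and `temperature` values are all assumptions chosen for illustration, and the learned network and the 2D Gaussian smoothing layer are omitted.

```python
import math


def render_heatmap(points, height, width, sigma=2.0):
    """Render one 2D Gaussian per (x, y) point; overlapping blobs are merged by max.

    The number of points may differ from image to image, which is what makes
    the target "multi-instance" rather than one channel per landmark.
    """
    heatmap = [[0.0] * width for _ in range(height)]
    for px, py in points:
        for y in range(height):
            for x in range(width):
                dist_sq = (x - px) ** 2 + (y - py) ** 2
                value = math.exp(-dist_sq / (2.0 * sigma ** 2))
                heatmap[y][x] = max(heatmap[y][x], value)
    return heatmap


def soft_argmax_2d(patch, temperature=10.0):
    """Softmax-weighted mean of pixel coordinates: a differentiable peak estimate."""
    peak = max(max(row) for row in patch)  # subtract the peak for numerical stability
    weights = [[math.exp(temperature * (v - peak)) for v in row] for row in patch]
    total = sum(sum(row) for row in weights)
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


def local_soft_argmax(heatmap, center, radius=4, temperature=10.0):
    """Apply soft-argmax inside a window around an integer (x, y) candidate."""
    cx, cy = center
    y0, y1 = max(0, cy - radius), min(len(heatmap), cy + radius + 1)
    x0, x1 = max(0, cx - radius), min(len(heatmap[0]), cx + radius + 1)
    patch = [row[x0:x1] for row in heatmap[y0:y1]]
    dx, dy = soft_argmax_2d(patch, temperature)
    return x0 + dx, y0 + dy  # shift back to full-image coordinates


# Two hypothetical suture points in a 32 x 48 image.
heatmap = render_heatmap([(10, 12), (30, 20)], height=32, width=48)
# Refine a rough candidate that is one pixel off the first point.
refined_x, refined_y = local_soft_argmax(heatmap, center=(11, 11))
```

Restricting the soft-argmax to a window is what lets it act like a local non-maximum suppression: each window collapses to a single coordinate, while other instances elsewhere in the image are left untouched. A higher `temperature` sharpens the weighting towards the strongest pixel.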
