Learning Motion Flows for Semi-supervised Instrument Segmentation from Robotic Surgical Video

Performing low hertz labeling for surgical videos at intervals can greatly releases the burden of surgeons. In this paper, we study the semi-supervised instrument segmentation from robotic surgical videos with sparse annotations. Unlike most previous methods using unlabeled frames individually, we propose a dual motion based method to wisely learn motion flows for segmentation enhancement by leveraging temporal dynamics. We firstly design a flow predictor to derive the motion for jointly propagating the frame-label pairs given the current labeled frame. Considering the fast instrument motion, we further introduce a flow compensator to estimate intermediate motion within continuous frames, with a novel cycle learning strategy. By exploiting generated data pairs, our framework can recover and even enhance temporal consistency of training sequences to benefit segmentation. We validate our framework with binary, part, and type tasks on 2017 MICCAI EndoVis Robotic Instrument Segmentation Challenge dataset. Results show that our method outperforms the state-of-the-art semi-supervised methods by a large margin, and even exceeds fully supervised training on two tasks.

[1]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[2]  Shawn D. Newsam,et al.  Improving Semantic Segmentation via Video Propagation and Label Relaxation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Ben Glocker,et al.  Semi-supervised Learning for Network-Based Cardiac MR Image Segmentation , 2017, MICCAI.

[4]  Jan Kautz,et al.  Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[5]  Lena Maier-Hein,et al.  2017 Robotic Instrument Segmentation Challenge , 2019, ArXiv.

[6]  Jon Barker,et al.  SDC-Net: Video Prediction Using Spatially-Displaced Convolution , 2018, ECCV.

[7]  Chi-Wing Fu,et al.  Uncertainty-aware Self-ensembling Model for Semi-supervised 3D Left Atrium Segmentation , 2019, MICCAI.

[8]  Sébastien Ourselin,et al.  ToolNet: Holistically-nested real-time segmentation of robotic surgical tools , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[9]  Jitendra Malik,et al.  View Synthesis by Appearance Flow , 2016, ECCV.

[10]  Feng Liu,et al.  Video Frame Interpolation via Adaptive Separable Convolution , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[11]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[12]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[13]  Alexander Rakhlin,et al.  Automatic Instrument Segmentation in Robot-Assisted Surgery Using Deep Learning , 2018, bioRxiv.

[14]  Jan Kautz,et al.  Unsupervised Video Interpolation Using Cycle Consistency , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[15]  Lena Maier-Hein,et al.  Generating large labeled data sets for laparoscopic image processing tasks using unpaired image-to-image translation , 2019, MICCAI.

[16]  Klaus H. Maier-Hein,et al.  Exploiting the potential of unlabeled endoscopic video data with self-supervised learning , 2017, International Journal of Computer Assisted Radiology and Surgery.

[17]  Pheng-Ann Heng,et al.  Incorporating Temporal Prior from Motion Flow for Instrument Segmentation in Minimally Invasive Surgery Video , 2019, MICCAI.

[18]  Danail Stoyanov,et al.  Patch-based adaptive weighting with segmentation and scale (PAWSS) for visual tracking , 2017, ArXiv.

[19]  Thomas Brox,et al.  FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Lin Yang,et al.  Deep Adversarial Networks for Biomedical Image Segmentation Utilizing Unannotated Images , 2017, MICCAI.

[21]  Pascal Fua,et al.  Simultaneous Recognition and Pose Estimation of Instruments in Minimally Invasive Surgery , 2017, MICCAI.

[22]  Blake Hannaford,et al.  Surgical Instrument Segmentation for Endoscopic Vision with Data Fusion of rediction and Kinematic Pose , 2019, 2019 International Conference on Robotics and Automation (ICRA).

[23]  Nassir Navab,et al.  CFCM: Segmentation via Coarse to Fine Context Memory , 2018, MICCAI.

[24]  Nicolas Padoy,et al.  Self-Supervised Surgical Tool Segmentation using Kinematic Information , 2019, 2019 International Conference on Robotics and Automation (ICRA).

[25]  Danail Stoyanov,et al.  More unlabelled data or label more data? A study on semi-supervised laparoscopic image segmentation , 2019, DART/MIL3ID@MICCAI.

[26]  Danail Stoyanov,et al.  EasyLabels: weak labels for scene segmentation in laparoscopic videos , 2019, International Journal of Computer Assisted Radiology and Surgery.