Uncertainty-aware Instance Segmentation using Dropout Sampling

Vision is an integral part of many robotic systems, especially when a robot must interact with its environment. In such cases, decisions made based on erroneous visual detections can have disastrous consequences. Hence, being able to accurately measure the uncertainty associated with visual information is highly important for making informed decisions. However, this uncertainty is often not captured by classic computer vision systems or metrics. In this paper, we address the task of instance segmentation in a robotics context, where we are concerned with uncertainty associated with not only the class of an object (semantic uncertainty) but also its location (spatial uncertainty). We apply dropout sampling to the state-of-the-art instance segmentation network Mask R-CNN to provide estimates of both semantic and spatial uncertainty. We show that a metric combining both measures provides a better estimate of uncertainty than either one individually. Additionally, we apply our technique to the ACRV Probabilistic Object Detection dataset, where it achieves a score of 14.65.
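The core mechanism the abstract refers to, Monte Carlo dropout sampling, keeps dropout active at test time and runs multiple stochastic forward passes, so the spread of the resulting predictions serves as an uncertainty estimate. The following is a minimal, self-contained sketch of that idea on a toy single-layer classifier; the weights, layer sizes, and sample count are all hypothetical and stand in for a full network such as Mask R-CNN:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weights for a single linear classifier layer (hypothetical,
# standing in for a trained network's final layer).
W = rng.normal(size=(8, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def mc_dropout_predict(x, n_samples=30, p_drop=0.5):
    """Run n_samples stochastic forward passes with dropout kept
    active at test time; return the mean class distribution
    (semantic estimate) and its per-class variance (uncertainty)."""
    probs = []
    for _ in range(n_samples):
        mask = rng.random(x.shape) > p_drop   # Bernoulli dropout mask
        x_d = x * mask / (1.0 - p_drop)       # inverted-dropout scaling
        probs.append(softmax(x_d @ W))
    probs = np.stack(probs)
    return probs.mean(axis=0), probs.var(axis=0)

x = rng.normal(size=8)                        # a toy feature vector
mean_p, var_p = mc_dropout_predict(x)
```

In the paper's setting the same sampling principle is applied to a full detector, so each pass also yields masks and boxes whose disagreement gives the spatial component of the uncertainty.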
