Segment My Object: A Pipeline to Extract Segmented Objects in Images based on Labels or Bounding Boxes

We propose a pipeline, SegMyO (Segment My Object), that automatically extracts segmented objects from images given labels and/or bounding boxes. Given an expected label, the system looks for the closest label in the list of outputs, using a measure of semantic similarity. Given a bounding box, it looks for the output object with the best coverage, based on several geometric criteria. Combined with a semantic segmentation model trained on a similar dataset, or with a good region proposal algorithm, this pipeline offers a simple way to segment a dataset efficiently without specific training, and it also addresses the problem of weakly-supervised segmentation. It is particularly useful for segmenting public datasets that come with weak object annotations (e.g., bounding boxes and labels from a detection, labels from a caption), whether these come from an algorithm or from manual annotation. An experimental study conducted on the PASCAL VOC 2012 dataset shows that the simple criteria embedded in SegMyO select the proposal with the best IoU score in most cases, and thus get the best out of the pre-segmentation.
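The selection step described above can be illustrated with a minimal sketch. It is not the SegMyO implementation: the abstract does not name the similarity measure or the geometric criteria, so this example assumes bounding-box IoU as the only geometric criterion and a hand-written lookup table as the semantic similarity. The names `box_iou`, `select_proposal`, `toy_similarity`, and the proposal dictionaries are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_proposal(
    proposals: List[Dict],
    label: Optional[str] = None,
    box: Optional[Box] = None,
    similarity: Optional[Callable[[str, str], float]] = None,
) -> Dict:
    """Return the proposal that best matches the weak annotation.

    Assumption: the score is the sum of the label similarity (if a label
    and a similarity function are given) and the box IoU (if a box is
    given). Raises ValueError if `proposals` is empty.
    """
    def score(p: Dict) -> float:
        s = 0.0
        if label is not None and similarity is not None:
            s += similarity(label, p["label"])
        if box is not None:
            s += box_iou(box, p["box"])
        return s

    return max(proposals, key=score)


# Toy stand-in for a real semantic similarity measure (assumed values).
_TOY_SIM = {
    frozenset({"puppy", "dog"}): 0.9,
    frozenset({"puppy", "cat"}): 0.6,
}


def toy_similarity(a: str, b: str) -> float:
    return 1.0 if a == b else _TOY_SIM.get(frozenset({a, b}), 0.0)


# Hypothetical outputs of a pre-segmentation model: a predicted label and
# the bounding box of each segmented region.
proposals = [
    {"label": "dog", "box": (10, 10, 50, 50)},
    {"label": "sofa", "box": (0, 30, 100, 90)},
    {"label": "cat", "box": (60, 20, 90, 45)},
]

by_label = select_proposal(proposals, label="puppy", similarity=toy_similarity)
by_box = select_proposal(proposals, box=(58, 18, 92, 48))
print(by_label["label"], by_box["label"])
```

With these toy inputs, the label query "puppy" picks the "dog" proposal and the box query picks the "cat" proposal. In practice the similarity function would be backed by a lexical resource or embeddings, and the box score would combine several criteria rather than IoU alone.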
