Deep Level Set for Box-supervised Instance Segmentation in Aerial Images

Box-supervised instance segmentation has recently attracted considerable research effort, yet it has received little attention in the aerial image domain. In contrast to general object collections, aerial objects exhibit large intra-class variance and inter-class similarity against complex backgrounds. Moreover, high-resolution satellite images contain many tiny objects. As a result, recent pairwise affinity modeling methods inevitably introduce noisy supervision and yield inferior results. To tackle these problems, we propose a novel aerial instance segmentation approach that drives the network to learn a series of level set functions for aerial objects in an end-to-end fashion, using only box annotations. Instead of learning pairwise affinity, the level set method, with carefully designed energy functions, treats object segmentation as curve evolution, which allows it to accurately recover object boundaries and to prevent interference from indistinguishable backgrounds and similar objects. Experimental results demonstrate that the proposed approach outperforms state-of-the-art box-supervised instance segmentation methods. The source code is available at https://github.com/LiWentomng/boxlevelset.
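To make the idea of segmentation as energy minimization over a level set function concrete, the following is a minimal sketch of a classical Chan-Vese style region energy evaluated inside a box crop. It is not the energy used in the paper, whose exact form is not given here. The function name `chan_vese_energy`, the sigmoid used as a smooth Heaviside function, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def chan_vese_energy(phi, image, eps=1e-8):
    """Chan-Vese style region energy of a level set map `phi` on an image crop.

    Hypothetical helper for illustration only: phi > 0 is treated as
    "inside the object", phi < 0 as "outside".
    """
    # Smooth Heaviside (assumed here to be a sigmoid): ~1 inside, ~0 outside.
    h = 1.0 / (1.0 + np.exp(-phi))
    # Mean intensity of the inside and outside regions.
    c_in = (h * image).sum() / (h.sum() + eps)
    c_out = ((1.0 - h) * image).sum() / ((1.0 - h).sum() + eps)
    # Each region is penalized for deviating from its own mean intensity.
    energy = h * (image - c_in) ** 2 + (1.0 - h) * (image - c_out) ** 2
    return float(energy.sum())

# Toy "box crop": a bright 4x4 object on a dark 8x8 background.
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0

# A level set that matches the object, and one that splits the crop in half.
good_phi = np.where(image > 0.5, 10.0, -10.0)
bad_phi = np.full((8, 8), -10.0)
bad_phi[:, :4] = 10.0

good_energy = chan_vese_energy(good_phi, image)
bad_energy = chan_vese_energy(bad_phi, image)
```

In this toy setting the level set aligned with the object boundary has lower energy than the misaligned one, which is the property that gradient-based curve evolution exploits. A learned variant would presumably have a network predict `phi` within each annotated box and minimize a differentiable energy of this kind.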