Paper Information - PolyWorld: Polygonal Building Extraction with Graph Neural Networks in Satellite Images

PolyWorld: Polygonal Building Extraction with Graph Neural Networks in Satellite Images

Most state-of-the-art instance segmentation methods produce binary segmentation masks. Geographic and cartographic applications, however, typically require precise vector polygons of the extracted objects rather than rasterized output. This paper introduces PolyWorld, a neural network that extracts building vertices directly from an image and connects them correctly to create precise polygons. The model predicts the connection strength between each pair of vertices using a graph neural network and estimates the assignments by solving a differentiable optimal transport problem. Moreover, the vertex positions are optimized by minimizing a combined segmentation and polygonal angle difference loss. PolyWorld significantly outperforms the state of the art in building polygonization, achieving notable quantitative results and producing visually pleasing building polygons. Code and trained weights will soon be available on GitHub.
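The assignment step described in the abstract can be illustrated with a small sketch. It assumes the pairwise connection strengths are arranged in a square score matrix and that the optimal transport problem is relaxed with Sinkhorn iterations (alternating row and column normalization), a common choice that the abstract itself does not name. The function names `sinkhorn` and `successors`, the iteration count, and the toy scores are hypothetical and are not taken from the paper or its code.

```python
import math


def sinkhorn(scores, n_iters=100):
    """Turn a square score matrix into an approximately doubly stochastic one.

    Only exp, sums and divisions are used, so the same steps stay
    differentiable when written in an autodiff framework.
    """
    n = len(scores)
    # Subtract the global maximum before exponentiating for numerical stability.
    top = max(max(row) for row in scores)
    p = [[math.exp(s - top) for s in row] for row in scores]
    for _ in range(n_iters):
        # Row normalization: every row sums to 1.
        for i in range(n):
            row_sum = sum(p[i])
            p[i] = [v / row_sum for v in p[i]]
        # Column normalization: every column sums to 1.
        for j in range(n):
            col_sum = sum(p[i][j] for i in range(n))
            for i in range(n):
                p[i][j] /= col_sum
    return p


def successors(p):
    """Pick the most likely next vertex for each vertex (row-wise argmax)."""
    return [max(range(len(row)), key=row.__getitem__) for row in p]


# Hypothetical scores for 3 vertices: entry [i][j] is the strength of
# "vertex j follows vertex i" along the polygon boundary.
scores = [
    [-5.0, 4.0, 0.0],
    [0.0, -5.0, 4.0],
    [4.0, 0.0, -5.0],
]
assignment = sinkhorn(scores)
next_vertex = successors(assignment)
print(next_vertex)
```

Following `next_vertex` from any starting vertex traces a closed polygon in this toy case. The argmax is only a decoding step for illustration; during training, a loss would be applied to the soft `assignment` matrix instead.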

Friedrich Fraundorfer | Stefan Habenschuss | Shabab Bazrafkan | Stefano Zorzi
