Learning Deep Structured Active Contours End-to-End

The world is covered with millions of buildings, and precisely knowing each instance's position and extents is vital to a multitude of applications. Recently, automated building footprint segmentation models have shown superior detection accuracy thanks to the use of Convolutional Neural Networks (CNNs). However, even the latest models struggle to delineate borders precisely, which often leads to geometric distortions and inadvertent fusion of adjacent building instances. We propose to overcome this issue by exploiting the distinct geometric properties of buildings. To this end, we present Deep Structured Active Contours (DSAC), a novel framework that integrates priors and constraints, such as continuous boundaries, smooth edges, and sharp corners, into the segmentation process. To do so, DSAC employs Active Contour Models (ACMs), a family of constraint- and prior-based polygonal models. We learn ACM parameterizations per instance using a CNN, and show how to incorporate all components in a structured output model, making DSAC trainable end-to-end. We evaluate DSAC on three challenging building instance segmentation datasets, where it compares favorably against the state of the art. Code will be made available at https://github.com/dmarcosg/DSAC.
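
To make the role of the ACM priors concrete, the following is a minimal sketch of the classical discrete snake energy for a closed polygon: an image-based data term plus a length penalty (weight alpha) and a curvature penalty (weight beta). It illustrates the general ACM idea only and is not taken from the DSAC code; the function name `snake_energy`, the scalar weights, and the nearest-pixel sampling are assumptions made for this example. In a CNN-driven setting such as the one described above, the data term and the weights would presumably be predicted per instance rather than fixed by hand.

```python
import numpy as np


def snake_energy(nodes, data_map, alpha, beta):
    """Discrete energy of a closed polygonal contour (hypothetical helper).

    nodes    : (N, 2) array of (row, col) vertex positions.
    data_map : 2-D array, image-based data term (lower is better).
    alpha    : weight of the length (first-difference) penalty.
    beta     : weight of the curvature (second-difference) penalty.
    """
    nodes = np.asarray(nodes, dtype=float)

    # Data term: sample the map at the nearest pixel, clipped to the image.
    rc = np.rint(nodes).astype(int)
    rows = np.clip(rc[:, 0], 0, data_map.shape[0] - 1)
    cols = np.clip(rc[:, 1], 0, data_map.shape[1] - 1)
    e_data = data_map[rows, cols].sum()

    # Internal terms: finite differences with wrap-around (closed contour).
    nxt = np.roll(nodes, -1, axis=0)
    prv = np.roll(nodes, 1, axis=0)
    e_length = alpha * ((nxt - nodes) ** 2).sum()
    e_curvature = beta * ((nxt - 2.0 * nodes + prv) ** 2).sum()

    return e_data + e_length + e_curvature


# Usage: a small square contour on a flat (all-zero) data map.
square = np.array([[0, 0], [0, 2], [2, 2], [2, 0]])
flat_map = np.zeros((3, 3))
energy = snake_energy(square, flat_map, alpha=1.0, beta=0.5)
```

A larger alpha favors shorter contours and a larger beta favors smoother ones, so lowering beta near a vertex lets the contour form a sharp corner there.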
