Classification With an Edge: Improving Semantic Image Segmentation with Boundary Detection

We present an end-to-end trainable deep convolutional neural network (DCNN) for semantic segmentation with built-in awareness of semantically meaningful boundaries. Semantic segmentation is a fundamental remote sensing task, and most state-of-the-art methods rely on DCNNs as their workhorse. A major reason for their success is that deep networks learn to accumulate contextual information over very large windows (receptive fields). However, this success comes at a cost, since the associated loss of effecive spatial resolution washes out high-frequency details and leads to blurry object boundaries. Here, we propose to counter this effect by combining semantic segmentation with semantically informed edge detection, thus making class-boundaries explicit in the model, First, we construct a comparatively simple, memory-efficient model by adding boundary detection to the Segnet encoder-decoder architecture. Second, we also include boundary detection in FCN-type models and set up a high-end classifier ensemble. We show that boundary detection significantly improves semantic segmentation with CNNs. Our high-end ensemble achieves > 90% overall accuracy on the ISPRS Vaihingen benchmark.

[1]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[2]  Cristian Sminchisescu,et al.  Semantic Segmentation with , 2012 .

[3]  King-Sun Fu,et al.  Information processing of remotely sensed agricultural data , 1969 .

[4]  S.M. Harris,et al.  Information Processing , 1977, Nature.

[5]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[6]  Kai Yu,et al.  Very deep convolutional neural networks for LVCSR , 2015, INTERSPEECH.

[7]  Iasonas Kokkinos,et al.  Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs , 2014, ICLR.

[8]  Vladlen Koltun,et al.  Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials , 2011, NIPS.

[9]  Markus Gerke,et al.  Use of the stair vision library within the ISPRS 2D semantic labeling benchmark (Vaihingen) , 2014 .

[10]  Zoubin Ghahramani,et al.  Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning , 2015, ICML.

[11]  Michael Kampffmeyer,et al.  Semantic Segmentation of Small Objects and Modeling of Uncertainty in Urban Remote Sensing Images Using Deep Convolutional Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[12]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[13]  Thomas Brox,et al.  FlowNet: Learning Optical Flow with Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[14]  Horst Bischof,et al.  Multispectral classification of Landsat-images using neural networks , 1992, IEEE Trans. Geosci. Remote. Sens..

[15]  Takayoshi Yamashita,et al.  Multiple Object Extraction from Aerial Imagery with Convolutional Neural Networks , 2016, IRIACV.

[16]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Jian Sun,et al.  Instance-Aware Semantic Segmentation via Multi-task Network Cascades , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Amy Loutfi,et al.  Classification and Segmentation of Satellite Orthoimagery Using Convolutional Neural Networks , 2016, Remote. Sens..

[19]  Vladlen Koltun,et al.  Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.

[20]  John A. Richards,et al.  Remote Sensing Digital Image Analysis , 1986 .

[21]  Uwe Stilla,et al.  SEMANTIC SEGMENTATION OF AERIAL IMAGES WITH AN ENSEMBLE OF CNNS , 2016 .

[22]  E. Baltsavias,et al.  A TEST OF AUTOMATIC ROAD EXTRACTION APPROACHES , 2006 .

[23]  Roberto Cipolla,et al.  SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Ronan Collobert,et al.  Recurrent Convolutional Neural Networks for Scene Labeling , 2014, ICML.

[25]  한보형,et al.  Learning Deconvolution Network for Semantic Segmentation , 2015 .

[26]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[27]  Jamie Sherrah,et al.  Effective semantic pixel labelling with convolutional networks and Conditional Random Fields , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[28]  Michele Volpi,et al.  Dense Semantic Labeling of Subdecimeter Resolution Images With Convolutional Neural Networks , 2016, IEEE Transactions on Geoscience and Remote Sensing.

[29]  Marius Leordeanu,et al.  Dual Local-Global Contextual Pathways for Recognition in Aerial Imagery , 2016, ArXiv.

[30]  Camille Couprie,et al.  Learning Hierarchical Features for Scene Labeling , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Jonathan T. Barron,et al.  Semantic Image Segmentation with Task-Specific Edge Detection Using CNNs and a Discriminatively Trained Domain Transform , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Iasonas Kokkinos,et al.  Pushing the Boundaries of Boundary Detection using Deep Learning , 2015, ICLR 2016.

[33]  Bertrand Le Saux,et al.  Semantic Segmentation of Earth Observation Data Using Multimodal and Multi-scale Deep Networks , 2016, ACCV.

[34]  Jianbo Shi,et al.  High-for-Low and Low-for-High: Efficient Boundary Detection from Deep Object Features and Its Applications to High-Level Vision , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[35]  Piotr Tokarczyk,et al.  Features, Color Spaces, and Boosting: New Insights on Semantic Classification of Remote Sensing Images , 2015, IEEE Transactions on Geoscience and Remote Sensing.

[36]  Honglak Lee,et al.  Object Contour Detection with a Fully Convolutional Encoder-Decoder Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Geoffrey E. Hinton,et al.  Learning to Detect Roads in High-Resolution Aerial Images , 2010, ECCV.

[38]  Zhuowen Tu,et al.  Deeply-Supervised Nets , 2014, AISTATS.

[39]  Ronan Collobert,et al.  Learning to Refine Object Segments , 2016, ECCV.

[40]  Richard Szeliski,et al.  Computer Vision - Algorithms and Applications , 2011, Texts in Computer Science.

[41]  Alexei A. Efros,et al.  Segmenting Scenes by Matching Image Composites , 2009, NIPS.

[42]  Jamie Sherrah,et al.  Fully Convolutional Networks for Dense Semantic Labelling of High-Resolution Aerial Imagery , 2016, ArXiv.

[43]  Andrew Y. Ng,et al.  Convolutional-Recursive Deep Learning for 3D Object Classification , 2012, NIPS.

[44]  David Malmgren-Hansen,et al.  Convolutional neural networks for SAR image segmentation , 2015, 2015 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT).

[45]  Xiao Xiang Zhu,et al.  Spatiotemporal scene interpretation of space videos via deep neural network and tracklet analysis , 2016, 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS).

[46]  Pierre Alliez,et al.  High-Resolution Semantic Labeling with Convolutional Neural Networks , 2016 .

[47]  S. Franklin,et al.  Empirical relations between digital SPOT HRV and CASI spectral response and lodgepole pine (Pinus contorta) forest stand parameters , 1993 .

[48]  Jon Atli Benediktsson,et al.  Morphological Attribute Profiles for the Analysis of Very High Resolution Images , 2010, IEEE Transactions on Geoscience and Remote Sensing.

[49]  Aykut Erdem FOR DEEP CONVOLUTIONAL NETWORKS , 2016 .

[50]  S. Barr,et al.  INFERRING URBAN LAND USE FROM SATELLITE SENSOR IMAGES USING KERNEL-BASED SPATIAL RECLASSIFICATION , 1996 .

[51]  L. Bottou,et al.  Deep Convolutional Networks for Scene Parsing , 2009 .