Semantic segmentation of multisensor remote sensing imagery with deep ConvNets and higher-order conditional random fields

Abstract. Aerial images acquired by multiple sensors provide comprehensive and diverse information about the materials and objects within a surveyed area. Pretrained deep convolutional neural networks (DCNNs), however, are usually restricted to three-band (i.e., RGB) images obtained from a single optical sensor, and the additional spectral bands from a multisensor setup make DCNNs harder to apply. We fuse the RGB feature information obtained from a deep learning framework with light detection and ranging (LiDAR) features to obtain semantic labeling. Specifically, we propose a decision-level multisensor fusion technique for semantic labeling of very-high-resolution optical imagery and LiDAR data. Our approach first obtains initial probabilistic predictions from two different sources: one from a pretrained neural network fine-tuned on a three-band optical image, and another from a probabilistic classifier trained on LiDAR data. These two predictions are then combined as the unary potential in a higher-order conditional random field (CRF) framework, which resolves fusion ambiguities by exploiting spatial–contextual information. We use graph cuts to efficiently infer the final semantic labeling under the proposed higher-order CRF. Experiments on three benchmark multisensor datasets demonstrate the performance advantages of the proposed method.
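
The decision-level fusion step described above can be illustrated with a minimal sketch. The abstract does not state how the two predictions are combined into the unary potential, so the weighted sum of negative log-probabilities used here is an assumption, as are the function names `fused_unary` and `unary_only_labels`, the weight `w_rgb`, and the toy probability values. The higher-order CRF terms and graph-cut inference are omitted; the sketch only shows how the LiDAR prediction can change a label that the RGB prediction alone would assign.

```python
import numpy as np

def fused_unary(p_rgb, p_lidar, w_rgb=0.5, eps=1e-12):
    """Combine two per-pixel class-probability maps of shape (H, W, K)
    into unary costs (assumed form: weighted negative log-probabilities)."""
    p_rgb = np.asarray(p_rgb, dtype=float)
    p_lidar = np.asarray(p_lidar, dtype=float)
    if p_rgb.shape != p_lidar.shape:
        raise ValueError("probability maps must share the same (H, W, K) shape")
    # Lower cost means a more likely label; eps guards against log(0).
    return -(w_rgb * np.log(p_rgb + eps)
             + (1.0 - w_rgb) * np.log(p_lidar + eps))

def unary_only_labels(unary):
    """Pick the minimum-cost label per pixel, ignoring all CRF context terms."""
    return np.argmin(unary, axis=-1)

# Toy 1x2 image with 3 classes (hypothetical values).
p_rgb = np.array([[[0.6, 0.3, 0.1], [0.4, 0.5, 0.1]]])    # optical-network output
p_lidar = np.array([[[0.5, 0.4, 0.1], [0.7, 0.2, 0.1]]])  # LiDAR-classifier output

unary = fused_unary(p_rgb, p_lidar)
labels = unary_only_labels(unary)
rgb_only = np.argmax(p_rgb, axis=-1)
print("RGB only:", rgb_only.tolist(), "fused:", labels.tolist())
```

In this toy case the second pixel is labeled class 1 by the RGB prediction alone but class 0 after fusion, because the LiDAR prediction strongly favors class 0. In a full pipeline, costs like these would serve as the unary term of the CRF energy rather than being minimized pixel by pixel.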
