CSVM Architectures for Pixel-Wise Object Detection in High-Resolution Remote Sensing Images

Object detection is an increasingly important task in the analysis of very high resolution (VHR) remote sensing imagery. As GPU computing capability has advanced, a growing number of deep convolutional neural networks (CNNs) have been designed to address the object detection challenge. However, GPUs are much more costly than CPUs, which makes GPU-based methods less attractive in practical applications. In this article, we propose a CPU-based method built on convolutional support vector machines (CSVMs) to address object detection in VHR images. Experiments are conducted on three VHR and two unmanned aerial vehicle (UAV) data sets with very limited training data. The results show that the proposed CSVM achieves performance competitive with U-Net, an efficient CNN-based model designed for small training data sets.
