Using convolutional features and a sparse autoencoder for land-use scene classification

ABSTRACT In this article, we propose a novel approach based on convolutional features and sparse autoencoder (AE) for scene-level land-use (LU) classification. This approach starts by generating an initial feature representation of the scenes under analysis from a deep convolutional neural network (CNN) pre-learned on a large amount of labelled data from an auxiliary domain. Then these convolutional features are fed as input to a sparse AE for learning a new suitable representation in an unsupervised manner. After this pre-training phase, we propose two different scenarios for building the classification system. In the first scenario, we add a softmax layer on the top of the AE encoding layer and then fine-tune the resulting network in a supervised manner using the target training images available at hand. Then we classify the test images based on the posterior probabilities provided by the softmax layer. In the second scenario, we view the classification problem from a reconstruction perspective. To this end we train several class-specific AEs (i.e. one AE per class) and then classify the test images based on the reconstruction error. Experimental results conducted on the University of California (UC) Merced and Banja-Luka LU public data sets confirm the superiority of the proposed approach compared to state-of-the-art methods.

[1]  Ivor W. Tsang,et al.  This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 1 Domain Adaptation from Multiple Sources: A Domain- , 2022 .

[2]  Baojun Zhao,et al.  Compressed-Domain Ship Detection on Spaceborne Optical Image Using Deep Neural Network and Extreme Learning Machine , 2015, IEEE Transactions on Geoscience and Remote Sensing.

[3]  Yingli Tian,et al.  Pyramid of Spatial Relatons for Scene-Level Land Use Classification , 2015, IEEE Transactions on Geoscience and Remote Sensing.

[4]  Gang Wang,et al.  Deep Learning-Based Classification of Hyperspectral Data , 2014, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

[5]  Vladimir Risojevic,et al.  Gabor Descriptors for Aerial Image Classification , 2011, ICANNGA.

[6]  Xin Huang,et al.  A multi-index learning approach for classification of high-resolution remotely sensed images over urban areas , 2014 .

[7]  Fei-Yue Wang,et al.  Traffic Flow Prediction With Big Data: A Deep Learning Approach , 2015, IEEE Transactions on Intelligent Transportation Systems.

[8]  Ivan Laptev,et al.  Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Fuqiang Chen,et al.  Subset based deep learning for RGB-D object recognition , 2015, Neurocomputing.

[10]  Gunnar Rätsch,et al.  An Empirical Analysis of Domain Adaptation Algorithms for Genomic Sequence Analysis , 2008, NIPS.

[11]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[12]  Jiwen Lu,et al.  Single Sample Face Recognition via Learning Deep Supervised Autoencoders , 2015, IEEE Transactions on Information Forensics and Security.

[13]  Junwei Han,et al.  Multi-class geospatial object detection and geographic image classification based on collection of part detectors , 2014 .

[14]  Mohammed Bennamoun,et al.  Deep Reconstruction Models for Image Set Classification , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Uwe Stilla,et al.  Deep Learning Earth Observation Classification Using ImageNet Pretrained Networks , 2016, IEEE Geoscience and Remote Sensing Letters.

[16]  J. Nocedal Updating Quasi-Newton Matrices With Limited Storage , 1980 .

[17]  Tara N. Sainath,et al.  Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.

[18]  Roland Memisevic,et al.  The Potential Energy of an Autoencoder , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Jorge Nocedal,et al.  On the limited memory BFGS method for large scale optimization , 1989, Math. Program..

[20]  Yoshua Bengio,et al.  Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.

[21]  Jürgen Schmidhuber,et al.  Deep learning in neural networks: An overview , 2014, Neural Networks.

[22]  Shiming Xiang,et al.  Vehicle Detection in Satellite Images by Hybrid Deep Convolutional Neural Networks , 2014, IEEE Geoscience and Remote Sensing Letters.

[23]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[24]  Lei Guo,et al.  Effective and Efficient Midlevel Visual Elements-Oriented Land-Use Classification Using VHR Remote Sensing Images , 2015, IEEE Transactions on Geoscience and Remote Sensing.

[25]  Brian P. Salmon,et al.  Multiview Deep Learning for Land-Use Classification , 2015, IEEE Geoscience and Remote Sensing Letters.

[26]  Camille Couprie,et al.  Learning Hierarchical Features for Scene Labeling , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27]  Xing Zhao,et al.  Spectral–Spatial Classification of Hyperspectral Data Based on Deep Belief Network , 2015, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

[28]  Andrew Zisserman,et al.  Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.

[29]  Ping Tang,et al.  Land-Use Scene Classification Using a Concentric Circle-Structured Multiscale Bag-of-Visual-Words Model , 2014, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

[30]  Roger C. Tam,et al.  Efficient Training of Convolutional Deep Belief Networks in the Frequency Domain for Application to High-Resolution 2D and 3D Images , 2015, Neural Computation.

[31]  Ivor W. Tsang,et al.  Visual Event Recognition in Videos by Learning from Web Data , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32]  Rong Yan,et al.  Cross-domain video concept detection using adaptive svms , 2007, ACM Multimedia.

[33]  Shawn D. Newsam,et al.  Spatial pyramid co-occurrence for image classification , 2011, 2011 International Conference on Computer Vision.

[34]  Hongyi Liu,et al.  A New Pan-Sharpening Method With Deep Neural Networks , 2015, IEEE Geoscience and Remote Sensing Letters.

[35]  Shawn D. Newsam,et al.  Bag-of-visual-words and spatial extensions for land-use classification , 2010, GIS '10.

[36]  Bo Du,et al.  Scene Classification via a Gradient Boosting Random Convolutional Network Framework , 2016, IEEE Transactions on Geoscience and Remote Sensing.

[37]  Lei Guo,et al.  Object Detection in Optical Remote Sensing Images Based on Weakly Supervised Learning and High-Level Feature Learning , 2015, IEEE Transactions on Geoscience and Remote Sensing.

[38]  Andrea Vedaldi,et al.  MatConvNet: Convolutional Neural Networks for MATLAB , 2014, ACM Multimedia.

[39]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[40]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[41]  Ivor W. Tsang,et al.  Domain Transfer Multiple Kernel Learning , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[42]  Naif Alajlan,et al.  Land-Use Classification With Compressive Sensing Multifeature Fusion , 2015, IEEE Geoscience and Remote Sensing Letters.