Do deep features generalize from everyday objects to remote sensing and aerial scenes domains?

In this paper, we evaluate the generalization power of deep features (ConvNets) in two new scenarios: aerial and remote sensing image classification. We experimentally evaluate ConvNets that were trained to recognize everyday objects on the classification of aerial and remote sensing images. ConvNets obtained the best results for aerial images; for remote sensing images, they performed well but were outperformed by low-level color descriptors such as BIC. We also present a correlation analysis that shows the potential for combining multiple ConvNets with one another or fusing them with other descriptors. A preliminary set of experiments fusing ConvNets obtains state-of-the-art results on the well-known UCMerced dataset.
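
To make the idea of a correlation analysis between descriptors concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes that each descriptor (a ConvNet or a low-level descriptor) has already produced class predictions on the same test images, and it computes the pairwise correlation coefficient between the two classifiers' correct/incorrect outcomes. The function names `correctness` and `pairwise_correlation` are hypothetical and chosen only for illustration. Under this measure, a value near 1 suggests the two descriptors make the same mistakes, while a low or negative value suggests they are complementary and may be worth fusing.

```python
from math import sqrt


def correctness(predictions, labels):
    """Return 1 where a prediction matches its label, else 0."""
    return [int(p == y) for p, y in zip(predictions, labels)]


def pairwise_correlation(correct_a, correct_b):
    """Correlation between two classifiers' correct/incorrect outcomes.

    Both arguments are 0/1 lists over the same test samples
    (an assumption of this sketch).
    """
    pairs = list(zip(correct_a, correct_b))
    n11 = sum(1 for a, b in pairs if a and b)          # both correct
    n00 = sum(1 for a, b in pairs if not a and not b)  # both wrong
    n10 = sum(1 for a, b in pairs if a and not b)      # only A correct
    n01 = sum(1 for a, b in pairs if not a and b)      # only B correct
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # Undefined when one classifier is always right or always wrong;
        # returning 0.0 here is a convention of this sketch.
        return 0.0
    return (n11 * n00 - n01 * n10) / denom


if __name__ == "__main__":
    # Toy, made-up labels and predictions purely for illustration.
    labels = [0, 1, 2, 0, 1, 2]
    preds_a = [0, 1, 2, 0, 2, 1]  # hypothetical descriptor A
    preds_b = [0, 2, 1, 0, 1, 2]  # hypothetical descriptor B
    corr = pairwise_correlation(correctness(preds_a, labels),
                                correctness(preds_b, labels))
    print(f"correlation between A and B: {corr:.3f}")
```

In this toy usage the two hypothetical descriptors err on different images, so their correlation is low, which is the situation in which combining them would be expected to help.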
