Sensitivity of Support Vector Machines to Random Feature Selection in Classification of Hyperspectral Data

The accuracy of supervised land cover classification depends on factors such as the chosen classification algorithm, the adequacy of the training data, the characteristics of the input data, and the selection of features. Hyperspectral imaging provides more detailed spectral and spatial information on land cover than other remote sensing sources. Over the past ten years, traditional and formerly widely accepted statistical classification methods have been superseded by more recent machine learning algorithms, e.g., support vector machines (SVMs), or by multiple classifier systems (MCSs). This shift can be explained by the limitations of statistical approaches with regard to high-dimensional data, multimodal classes, and the often limited availability of training data. In this study, MCSs based on SVMs and random feature selection (RFS) are applied to explore the potential of a synergetic use of the two concepts. Using two hyperspectral data sets from different environmental settings, we investigate how the number of selected features and the size of the MCS influence classification accuracy. In addition, experiments are conducted with a varying number of training samples. Accuracies are compared with those of a regular SVM and of random forests. The experimental results clearly demonstrate that generating an SVM-based classifier system with RFS significantly improves the overall classification accuracy as well as the producer's and user's accuracies. In addition, the ensemble strategy yields smoother, i.e., more realistic, classification maps than a stand-alone SVM. The findings from the experiments were successfully transferred to an additional hyperspectral data set.
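
The general idea of a classifier system built on random feature selection can be illustrated with a minimal sketch. It is not the authors' implementation. Each ensemble member is trained on a random subset of the spectral bands, and the members' predictions are then combined. Several parts are assumptions made only for illustration: the combination rule (a simple majority vote, which the abstract does not specify), the function names, and the toy data. To keep the sketch runnable with the Python standard library alone, a nearest-centroid classifier stands in for the SVM base learner; any training function with the same interface could be passed as `train_base` instead.

```python
import random
from collections import Counter


def train_centroid(X, y):
    """Stand-in base learner (NOT an SVM): nearest class centroid."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

    def predict(row):
        # Return the class whose centroid is closest (squared Euclidean distance).
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
        )

    return predict


def train_rfs_ensemble(X, y, n_members, n_features, train_base=train_centroid, seed=0):
    """Train n_members base classifiers, each on a random subset of features."""
    rng = random.Random(seed)
    n_total = len(X[0])
    members = []
    for _ in range(n_members):
        # Draw a feature subset without replacement for this member.
        subset = sorted(rng.sample(range(n_total), n_features))
        X_sub = [[row[j] for j in subset] for row in X]
        members.append((subset, train_base(X_sub, y)))
    return members


def predict_rfs_ensemble(members, row):
    """Combine member predictions by majority vote (an assumed fusion rule)."""
    votes = Counter(predict([row[j] for j in subset]) for subset, predict in members)
    return votes.most_common(1)[0][0]


def make_toy_data(n_per_class, n_bands, seed):
    """Hypothetical two-class data: Gaussian 'bands' with shifted class means."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(2.0 * label, 1.0) for _ in range(n_bands)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X_train, y_train = make_toy_data(n_per_class=50, n_bands=20, seed=1)
    X_test, y_test = make_toy_data(n_per_class=50, n_bands=20, seed=3)
    members = train_rfs_ensemble(X_train, y_train, n_members=15, n_features=5, seed=2)
    hits = sum(predict_rfs_ensemble(members, r) == t for r, t in zip(X_test, y_test))
    print("overall accuracy:", hits / len(y_test))
```

The two quantities varied in the study correspond to `n_features` (the number of selected features per member) and `n_members` (the size of the classifier system) in this sketch.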
