Learning Relevant Image Features With Multiple-Kernel Classification

The increase in the spatial and spectral resolution of satellite sensors, along with shorter revisit times, has provided high-quality data for remote sensing image classification. However, the high-dimensional feature space induced by combining many heterogeneous information sources precludes the use of simple classifiers; proper feature selection is therefore required to discard irrelevant features and adapt the model to the specific problem. This paper proposes to classify the images and, simultaneously, to learn the relevant features in such high-dimensional scenarios. The proposed method is based on the automatic optimization of a linear combination of kernels, each dedicated to a different meaningful set of features. Such sets can be groups of bands, contextual or textural features, or bands acquired by different sensors. The combination of kernels is optimized through gradient descent on the support vector machine objective function. Even though the combination is linear, the resulting relevance ranking accounts for the intrinsic nonlinearity of the data through the kernels. Since a naive selection of the free parameters of the multiple-kernel method is computationally demanding, we propose an efficient model selection procedure based on kernel alignment. The result is a weight, learned from the data, for each kernel, so that both relevant and meaningless image features emerge automatically after training the model. Experiments carried out on multi- and hyperspectral, contextual, and multisource remote sensing data classification confirm the capability of the method to rank the relevant features and show the computational efficiency of the proposed strategy.
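
The two building blocks named in the abstract, a weighted linear combination of per-group kernels and the kernel alignment used for model selection, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the Gaussian RBF kernel, the toy data, the feature-group names (`X_spectral`, `X_context`), the candidate bandwidths, and the fixed weights are all assumptions made for the example. In the paper the weights are learned by gradient descent on the SVM objective, which is not reproduced here.

```python
import math

def rbf_kernel(X, gamma):
    """Gram matrix of a Gaussian RBF kernel on the rows of X (assumed kernel choice)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj)))
             for xj in X] for xi in X]

def frobenius(A, B):
    """Frobenius inner product <A, B>_F of two square matrices."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

def alignment(K, y):
    """Kernel-target alignment: <K, yy^T>_F / (||K||_F * ||yy^T||_F)."""
    Y = [[yi * yj for yj in y] for yi in y]
    return frobenius(K, Y) / math.sqrt(frobenius(K, K) * frobenius(Y, Y))

def combine(kernels, weights):
    """Composite kernel K = sum_m d_m * K_m for nonnegative weights d_m."""
    n = len(kernels[0])
    return [[sum(d * K[i][j] for d, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

# Toy two-class problem with two hypothetical feature groups.
y = [1, 1, -1, -1]
X_spectral = [[0.0], [0.1], [1.0], [1.1]]   # separates the two classes
X_context = [[0.5], [-0.5], [0.5], [-0.5]]  # unrelated to the labels

# Model selection by alignment: choose the bandwidth whose kernel best matches yy^T,
# which avoids training an SVM for every candidate value.
candidates = [0.1, 1.0, 10.0]
best_gamma = max(candidates, key=lambda g: alignment(rbf_kernel(X_spectral, g), y))

K_spectral = rbf_kernel(X_spectral, best_gamma)
K_context = rbf_kernel(X_context, 1.0)

# Illustrative fixed weights (assumed); the paper learns them from the data.
weights = [0.8, 0.2]
K_combined = combine([K_spectral, K_context], weights)

print("alignment (spectral):", alignment(K_spectral, y))
print("alignment (context): ", alignment(K_context, y))
```

On this toy data the informative group receives a higher alignment than the uninformative one, which is the kind of signal a relevance ranking over feature groups relies on. The combined Gram matrix could then be passed to any SVM solver that accepts a precomputed kernel.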
