Ensemble Learning in Hyperspectral Image Classification: Toward Selecting a Favorable Bias-Variance Tradeoff

Automated classification of hyperspectral images is a fast-growing field with numerous applications in security and surveillance, agriculture, urban management, and environmental monitoring. Although significant progress has been made on individual aspects of hyperspectral classification (e.g., feature extraction, feature selection, classification, and post-classification processing), the problem has not so far been addressed from a bias-variance decomposition point of view. In this work, we introduce a consistent, unified framework that jointly considers all steps in the hyperspectral image classification chain from a bias-variance decomposition perspective. Additionally, we show how state-of-the-art techniques in feature extraction, ensemble-based classification, and post-classification segmentation relate to the bias-variance tradeoff and how this relation can be used to improve classification accuracy. An important outcome of our analysis is that all steps of the classification chain should be optimized jointly, as this unified optimization can guide toward a more favorable bias-variance tradeoff. Experimental results on four hyperspectral datasets demonstrate the effectiveness of the proposed framework.
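As a rough illustration of why ensembles matter for the bias-variance tradeoff, the sketch below estimates squared bias and variance for a single estimator and for an average of several. It is a toy scalar example under squared loss, not the framework of this work: the shrunken-mean estimator, the function names, and all parameter values are assumptions chosen for illustration, and the ensemble members are drawn independently, which real bagged classifiers are not.

```python
import random
import statistics


def base_estimate(rng, true_value=1.0, noise_sd=1.0, n=5, shrink=0.8):
    """Hypothetical base learner: a shrunken mean of n noisy samples.

    The shrink factor makes it biased; the small sample makes it noisy.
    """
    samples = [true_value + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return shrink * statistics.fmean(samples)


def ensemble_estimate(rng, members, true_value=1.0):
    """Average of `members` independently trained base learners."""
    return statistics.fmean(
        base_estimate(rng, true_value=true_value) for _ in range(members)
    )


def bias_variance(members, trials=4000, true_value=1.0, seed=0):
    """Monte Carlo estimate of (bias^2, variance, mean squared error)."""
    rng = random.Random(seed)
    preds = [ensemble_estimate(rng, members, true_value) for _ in range(trials)]
    mean_pred = statistics.fmean(preds)
    bias_sq = (mean_pred - true_value) ** 2
    variance = statistics.pvariance(preds, mu=mean_pred)
    mse = statistics.fmean((p - true_value) ** 2 for p in preds)
    return bias_sq, variance, mse


if __name__ == "__main__":
    for m in (1, 5, 25):
        b, v, e = bias_variance(m)
        print(f"members={m:2d}  bias^2={b:.4f}  variance={v:.4f}  mse={e:.4f}")
```

In this idealized setting, averaging leaves the squared bias essentially unchanged while the variance shrinks with the number of members, so the remaining error is dominated by bias. This is one way to see why the choice of ensemble cannot be separated from the bias introduced by the other steps of the chain.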
