Bayesian Hyperspectral Image Segmentation With Discriminative Class Learning

This paper introduces a new supervised technique for segmenting hyperspectral images: Bayesian segmentation based on discriminative classification and on a multilevel logistic (MLL) spatial prior. The approach is Bayesian and exploits both spectral and spatial information. Given a spectral vector, the posterior class probability distribution is modeled using multinomial logistic regression (MLR), which, being a discriminative model, allows the boundaries between the decision regions to be learned directly and thus copes well with high-dimensional data. To control the machine complexity and, thus, its generalization capacity, the prior on the vector of MLR weights is assumed to follow a componentwise independent Laplacian density. The vector of weights is computed via fast sparse multinomial logistic regression (FSMLR), a variation of sparse multinomial logistic regression (SMLR) conceived to deal with large data sets beyond the reach of SMLR. To avoid the high computational complexity involved in estimating the Laplacian regularization parameter, we have also considered the Jeffreys prior, as it does not depend on any hyperparameter. The prior probability distribution on the class-label image is an MLL Markov-Gibbs distribution, which favors segmentations in which neighboring pixels share the same class label. The α-expansion optimization algorithm, a powerful graph-cut-based integer optimization tool, is used to compute the maximum a posteriori segmentation. The effectiveness of the proposed methodology is illustrated by comparing its performance with that of state-of-the-art methods on synthetic and real hyperspectral image data sets. The reported results give clear evidence of the relevance of using both spatial and spectral information in hyperspectral image segmentation.
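
The way the spectral and spatial terms combine can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `mlr_posterior` and `map_objective`, the 4-neighbor grid, the smoothness weight `beta`, and the assumption of equiprobable classes (so that the MLR posterior can stand in for the data term) are all illustrative choices. The sketch only evaluates the objective for a given labeling; it does not perform the α-expansion optimization or the sparse weight estimation.

```python
import math


def mlr_posterior(x, weights):
    """Class posteriors p(y = k | x, w) under multinomial logistic regression.

    x is a feature (spectral) vector; weights holds one weight vector per class.
    """
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def map_objective(labels, posteriors, beta):
    """Log-posterior of a labeling, up to an additive constant.

    Spectral term: sum over pixels of log p(y_i | x_i)
    (assumes equiprobable classes, an assumption of this sketch).
    Spatial term (MLL-style): beta for every pair of 4-neighbors
    that share the same label.
    """
    rows, cols = len(labels), len(labels[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            k = labels[r][c]
            total += math.log(posteriors[r][c][k])
            # Count each horizontal and vertical neighbor pair once.
            if c + 1 < cols and k == labels[r][c + 1]:
                total += beta
            if r + 1 < rows and k == labels[r + 1][c]:
                total += beta
    return total


# Toy 1x3 image with two classes; the middle pixel is spectrally ambiguous.
toy_posteriors = [[[0.9, 0.1], [0.4, 0.6], [0.9, 0.1]]]
pixelwise = [[0, 1, 0]]  # per-pixel argmax of the posteriors
smooth = [[0, 0, 0]]     # spatially uniform labeling
```

With `beta = 0` the objective prefers the `pixelwise` labeling, while a sufficiently large `beta` (for example `1.0`) makes the `smooth` labeling score higher, which is the effect the spatial prior is meant to have. A graph-cut solver would search over all labelings rather than compare two candidates.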
