A comparative evaluation of filter-based feature selection methods for hyper-spectral band selection

Band selection, a form of dimensionality reduction, plays an essential role in hyper-spectral image processing and its applications. This article presents a unified framework for systematically comparing filter-based feature selection models and uses it to evaluate four methods for hyper-spectral band selection: maximal minimal associated index (MMAIQ), mutual information-based max-dependency criterion (mRMR), relief feature selection (Relief-F), and correlation-based feature selection (CFS). The evaluation covers effectiveness, robustness, and classification accuracy, measured by five indices: class separability, feature entropy, feature stability, feature redundancy, and classification accuracy. Three images acquired by different sensors were used to assess the methods. Experimental results show that MMAIQ performs best on all data sets for every index except feature stability, where mRMR and Relief-F are superior.
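To make the idea of a filter-based criterion concrete, the following is a minimal sketch of greedy band selection in the style of mRMR, using the common "relevance minus mean redundancy" (difference) form. It is not the implementation evaluated in the article. It assumes each band has already been quantized into discrete levels, and the names `mutual_information` and `mrmr_select` are hypothetical helpers introduced only for this illustration.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # Sum over observed pairs of p(a,b) * log(p(a,b) / (p(a) * p(b))).
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def mrmr_select(bands, labels, k):
    """Greedily pick up to k band indices (assumed scoring: relevance - mean redundancy)."""
    relevance = [mutual_information(b, labels) for b in bands]
    # Start with the single most relevant band.
    selected = [max(range(len(bands)), key=relevance.__getitem__)]
    while len(selected) < min(k, len(bands)):
        def score(j):
            redundancy = sum(
                mutual_information(bands[j], bands[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        candidates = [j for j in range(len(bands)) if j not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Toy data: bands 0 and 1 are identical, band 2 carries complementary information.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
band_a = [0, 0, 0, 0, 1, 1, 1, 1]
band_c = [0, 0, 1, 1, 0, 0, 1, 1]
print(mrmr_select([band_a, list(band_a), band_c], labels, 2))  # prints [0, 2]
```

In this toy case all three bands are equally relevant to the labels, but the duplicate band is skipped because it is fully redundant with the first selection. This is the kind of behavior that the feature redundancy index in the evaluation is meant to capture.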
