Using a genetic algorithm with histogram-based feature selection in hyperspectral image classification

Optical sensing has the potential to be an important tool in the automated monitoring of food quality. Specifically, hyperspectral imaging has enjoyed success in a variety of tasks ranging from plant species classification to ripeness evaluation in produce. Although effective, hyperspectral imaging is prohibitively expensive to deploy at scale in a retail setting. With this in mind, we develop a method to assist in designing a low-cost multispectral imager for produce monitoring by using a genetic algorithm (GA) that simultaneously selects a subset of informative wavelengths and identifies effective filter bandwidths for such an imager. Instead of selecting the single fittest member of the final population as our solution, we fit a univariate Gaussian mixture model to the histogram of the overall GA population, selecting the wavelengths associated with the peaks of the distributions as our solution. By evaluating the entire population, rather than a single solution, we are also able to specify filter bandwidths by calculating the standard deviations of the Gaussian distributions and computing the full-width at half-maximum values. In our experiments, we find that this novel histogram-based method for feature selection is effective when compared to both the standard GA and partial least squares discriminant analysis.
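The histogram-based selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each wavelength gene in the final GA population can be pooled into one flat list of values, that the number of mixture components `k` is known in advance, and that a hand-rolled expectation-maximization loop is an acceptable stand-in for whatever mixture-fitting routine was actually used. The population is synthetic, and the names `fit_gmm_1d` and `select_bands` are hypothetical. Fitting the mixture to the pooled values is treated here as equivalent to fitting it to their histogram.

```python
import math
import random

# FWHM of a Gaussian is 2*sqrt(2*ln 2) times its standard deviation (about 2.355).
FWHM_FACTOR = 2.0 * math.sqrt(2.0 * math.log(2.0))


def fit_gmm_1d(values, k, iters=100):
    """Fit a k-component univariate Gaussian mixture with plain EM.

    Returns a list of (mean, sigma, weight) tuples sorted by mean.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    # Assumed initialization: means spread evenly over the data range.
    means = [lo + (j + 0.5) * span / k for j in range(k)]
    sigmas = [span / (2.0 * k)] * k
    weights = [1.0 / k] * k
    n = len(values)
    norm = math.sqrt(2.0 * math.pi)
    for _ in range(iters):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [
                w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * norm)
                for w, m, s in zip(weights, means, sigmas)
            ]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and sigma of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-3)
            weights[j] = nj / n
    return sorted(zip(means, sigmas, weights))


def select_bands(values, k):
    """Return (center wavelength, FWHM bandwidth) for each fitted component."""
    return [(m, FWHM_FACTOR * s) for m, s, _ in fit_gmm_1d(values, k)]


# Synthetic stand-in for the pooled wavelength genes (nm) of a GA population.
rng = random.Random(0)
true_bands = [(550.0, 10.0), (680.0, 15.0), (850.0, 20.0)]
population = [rng.gauss(mu, sd) for mu, sd in true_bands for _ in range(300)]

bands = select_bands(population, k=3)
for center, fwhm in bands:
    print(f"center = {center:.1f} nm, FWHM = {fwhm:.1f} nm")
```

Each fitted mean plays the role of a selected wavelength, and each fitted standard deviation is converted to a filter bandwidth through the full-width at half-maximum relation FWHM = 2·sqrt(2·ln 2)·σ.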
