A Method of Honey Plant Classification Based on IR Spectrum: Extract Feature Wavelength Using Genetic Algorithm and Classify Using Linear Discriminate Analysis

The Bayesian linear classifier is a basic statistical approach to pattern classification. For the classification of three different nectar plants, near-infrared (NIR) spectral data were acquired. NIR spectra are characterized by a small number of samples and high dimensionality. This paper develops a method for selecting feature wavelengths with a genetic algorithm (GA), which addresses the problem of extracting effective information from a high-dimensional data matrix. The fitness function of the GA is set to minimize the classification error rate. The K-S algorithm was used to construct the calibration and validation sets, which contain 132 and 42 samples, respectively. Feature wavelengths were selected separately for each preprocessing method. The results indicate that 10 feature wavelengths selected from the raw data give the best discrimination compared with the principal component analysis-linear discriminant analysis (PCA-LDA) model. They also indicate that the GA-LDA classifier simplifies the model and that the correct classification rate increases markedly when the feature wavelengths are used.
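The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes "K-S" refers to the Kennard-Stone max-min distance selection, it evaluates the GA fitness (LDA error rate) on the validation set, and the function names, the GA operators (elitist selection, set-union crossover, single-index mutation), and all parameter defaults are hypothetical choices made for the example.

```python
import numpy as np

def lda_error_rate(X_cal, y_cal, X_val, y_val, ridge=1e-6):
    """Fit a pooled-covariance LDA on the calibration set; return the validation error rate."""
    classes = np.unique(y_cal)
    means = np.array([X_cal[y_cal == c].mean(axis=0) for c in classes])
    priors = np.array([(y_cal == c).mean() for c in classes])
    # Pooled within-class covariance; the small ridge term is an assumption for stability.
    centered = X_cal - means[np.searchsorted(classes, y_cal)]
    cov = centered.T @ centered / (len(X_cal) - len(classes))
    cov += ridge * np.eye(X_cal.shape[1])
    w = np.linalg.solve(cov, means.T)                     # (n_features, n_classes)
    b = -0.5 * np.sum(means.T * w, axis=0) + np.log(priors)
    pred = classes[np.argmax(X_val @ w + b, axis=1)]      # largest linear discriminant score
    return float(np.mean(pred != y_val))

def kennard_stone(X, n_cal):
    """Return indices of n_cal samples chosen by max-min distance (assumed meaning of K-S)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    i, j = np.unravel_index(np.argmax(d), d.shape)        # start with the two farthest samples
    chosen = [int(i), int(j)]
    while len(chosen) < n_cal:
        min_d = d[:, chosen].min(axis=1)                  # distance to the nearest chosen sample
        min_d[chosen] = -1.0                              # never re-pick a chosen sample
        chosen.append(int(np.argmax(min_d)))
    return np.array(chosen)

def ga_select(X_cal, y_cal, X_val, y_val, n_select=10, pop_size=30,
              n_gen=40, p_mut=0.2, seed=0):
    """Search for n_select wavelength indices that minimize the LDA error rate."""
    rng = np.random.default_rng(seed)
    n_wl = X_cal.shape[1]

    def fitness(idx):
        return lda_error_rate(X_cal[:, idx], y_cal, X_val[:, idx], y_val)

    # Each chromosome is a sorted set of n_select distinct wavelength indices.
    pop = [np.sort(rng.choice(n_wl, n_select, replace=False)) for _ in range(pop_size)]
    for _ in range(n_gen):
        scores = np.array([fitness(ind) for ind in pop])
        elite = [pop[i] for i in np.argsort(scores)[:pop_size // 2]]  # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.choice(len(elite), 2, replace=False)
            pool = np.union1d(elite[a], elite[b])         # crossover: draw from both parents
            child = rng.choice(pool, n_select, replace=False)
            if rng.random() < p_mut:                      # mutation: swap in an unused index
                unused = np.setdiff1d(np.arange(n_wl), child)
                child[rng.integers(n_select)] = rng.choice(unused)
            children.append(np.sort(child))
        pop = elite + children
    scores = np.array([fitness(ind) for ind in pop])
    best = int(np.argmin(scores))
    return pop[best], float(scores[best])
```

With a spectra matrix `X` (samples by wavelengths) and class labels `y`, one might call `cal = kennard_stone(X, 132)`, take the remaining 42 samples as the validation set, and then run `ga_select` with `n_select=10`. Scoring the GA on the same validation set that is later used to report accuracy gives an optimistic estimate, so a separate test set or cross-validation within the calibration set would be a more cautious design.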
