Multi-objective Genetic Algorithm for Variable Selection in Multivariate Classification Problems: A Case Study in Verification of Biodiesel Adulteration

This paper proposes a multi-objective genetic algorithm for the problem of variable selection in multivariate calibration. We consider the classification of biodiesel samples to detect adulteration using a Linear Discriminant Analysis (LDA) classifier. The goal of the multi-objective algorithm is to reduce the dimensionality of the original set of variables so that the classification model is less sensitive to noise and generalizes better. In particular, we adopt a version of the Non-dominated Sorting Genetic Algorithm (NSGA-II) and compare it to a mono-objective Genetic Algorithm (GA) in terms of sensitivity in the presence of noise. Results show that the mono-objective GA selects 20 variables on average and presents an error rate of 14%. On the other hand, the multi-objective algorithm selects 7 variables and has an error rate of 11%. Consequently, we show that the multi-objective formulation provides classification models with lower sensitivity to instrumental noise than the mono-objective formulation.
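The comparison above can be read in terms of Pareto dominance, the ordering that NSGA-II uses to rank solutions. The following is a minimal sketch, not the authors' implementation. It assumes the two objectives to minimize are the number of selected variables and the classification error rate, which the abstract suggests but does not state explicitly. The first two candidates use the averages reported above; the other two are hypothetical points added only for illustration.

```python
from typing import List, Tuple

# Assumed objective vector: (number of selected variables, error rate),
# both to be minimized. This pairing is an assumption for illustration.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b in every objective
    and strictly better in at least one (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q != p)
    ]


candidates = [
    (20, 0.14),  # mono-objective GA averages reported in the abstract
    (7, 0.11),   # multi-objective averages reported in the abstract
    (5, 0.16),   # hypothetical: fewer variables, higher error
    (12, 0.11),  # hypothetical: same error, more variables
]

front = pareto_front(candidates)
print(front)
```

Under these assumptions, the reported multi-objective result dominates the mono-objective one because it is better in both objectives, while the hypothetical (5, 0.16) point stays on the front as a trade-off between fewer variables and higher error.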
