Feature learning with a genetic algorithm for fluorescence fingerprinting of plant species

Proper feature analysis facilitates recognition by focusing the process on those characteristics of the observed data that carry the most significant information for the given classification task. In this paper we address the problem of feature selection from a different point of view. Instead of searching for a feature subset within a large set of predefined candidate features, we consider the situation where the form of the features and an algorithm for extracting them from the data are given, and an optimizer tunes the feature extraction parameters to improve class separability. This feature learning problem is solved by means of a genetic algorithm. The optimized feature set is subsequently used in a neural network classifier. The performance of the feature learning approach is demonstrated on the problem of automatic identification of plant species from their fluorescence induction curves. The general approach should also be useful for other pattern recognition problems where a priori unknown characteristics are extracted from a large feature space.
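
The idea of tuning feature extraction parameters with a genetic algorithm can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic curves, the feature form (the mean of the curve over a window with parameters `start` and `width`), the Fisher-style separability criterion, the genetic operators (binary tournament selection, uniform crossover, Gaussian mutation, elitism), and the function names `make_curve`, `extract`, `separability` and `evolve`.

```python
import math
import random

def make_curve(species, rng, n=100):
    # Hypothetical synthetic "induction curve": the two species differ
    # only by a bump located around sample 60.
    bump = 0.5 if species == 1 else 0.0
    return [math.exp(-t / 40.0)
            + bump * math.exp(-((t - 60) / 5.0) ** 2)
            + rng.gauss(0.0, 0.05)
            for t in range(n)]

def extract(curve, params):
    # Assumed feature form: mean of the curve over [start, start + width).
    start, width = params
    start = max(0, min(len(curve) - 1, int(start)))
    width = max(1, int(width))
    window = curve[start:start + width]
    return sum(window) / len(window)

def separability(params, data):
    # Fisher-style criterion for one feature and two classes: squared
    # difference of class means over the sum of class variances.
    stats = []
    for label in (0, 1):
        f = [extract(c, params) for c, y in data if y == label]
        mean = sum(f) / len(f)
        var = sum((v - mean) ** 2 for v in f) / len(f)
        stats.append((mean, var))
    (m0, v0), (m1, v1) = stats
    return (m0 - m1) ** 2 / (v0 + v1 + 1e-12)

def evolve(data, rng, pop_size=30, generations=40, n=100):
    # Each individual is one (start, width) parameter pair.
    pop = [(rng.uniform(0, n - 1), rng.uniform(1, 20))
           for _ in range(pop_size)]
    best = max(pop, key=lambda p: separability(p, data))
    for _ in range(generations):
        scored = [(separability(p, data), p) for p in pop]

        def pick():
            # Binary tournament selection.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            pa, pb = pick(), pick()
            # Uniform crossover followed by Gaussian mutation.
            child = tuple(rng.choice(pair) for pair in zip(pa, pb))
            child = (child[0] + rng.gauss(0.0, 3.0),
                     child[1] + rng.gauss(0.0, 1.0))
            children.append(child)
        pop = children
        best = max(pop, key=lambda p: separability(p, data))
    return best

if __name__ == "__main__":
    data_rng = random.Random(0)
    data = [(make_curve(y, data_rng), y) for y in (0, 1) for _ in range(20)]
    best = evolve(data, random.Random(1))
    print("learned window:", best, "separability:", separability(best, data))
```

In a fuller pipeline the values returned by `extract` for several learned parameter sets would form the input vector of a classifier, such as the neural network mentioned above; that step is omitted here.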
