IOGA: An Instance-Oriented Genetic Algorithm

Instance-based methods of classification are easy to implement, easy to explain and relatively robust. Furthermore, they have often been found in empirical studies to be competitive in accuracy with more sophisticated classification techniques (Aha et al., 1991; Weiss & Kulikowski, 1991; Fogarty, 1992; Michie et al., 1994). However, a twofold drawback of the simplest instance-based classification method (1-NNC) is that it requires the storage of all training instances and the use of all attributes or features on which those instances are measured — thus failing to exhibit the cognitive economy which is the hallmark of successful learning (Wolff, 1991). Previous researchers have proposed ways of adapting the basic 1-NNC algorithm either to select only a subset of training cases (‘prototypes’) or to discard redundant and/or ‘noisy’ attributes, but not to do both at once. The present paper describes a program (IOGA) that uses an evolutionary algorithm to select prototypical cases and relevant attributes simultaneously, and evaluates it empirically by application to a set of test problems from a variety of fields. These trials show that very considerable economization of storage can be achieved, coupled with a modest gain in accuracy.
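The idea of selecting prototypes and attributes together can be illustrated with a small sketch. It is not IOGA itself: the chromosome layout (one bit per training case followed by one bit per attribute), the fitness function (training accuracy minus a size penalty), the steady-state replacement scheme and all names (`nn_predict`, `fitness`, `evolve`, `penalty`) are assumptions made only for illustration.

```python
import random

def nn_predict(query, prototypes, labels, feature_mask):
    """Label of the nearest prototype, measuring distance on masked-in features only."""
    best_label, best_dist = None, float("inf")
    for proto, label in zip(prototypes, labels):
        dist = sum((q - p) ** 2
                   for q, p, use in zip(query, proto, feature_mask) if use)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def fitness(chromosome, X, y, penalty=0.01):
    """Assumed fitness: accuracy on X minus a small cost per retained case or attribute."""
    n = len(X)
    case_mask, feature_mask = chromosome[:n], chromosome[n:]
    protos = [x for x, keep in zip(X, case_mask) if keep]
    proto_labels = [c for c, keep in zip(y, case_mask) if keep]
    if not protos or not any(feature_mask):
        return 0.0  # nothing retained, so nothing can be classified
    correct = sum(nn_predict(x, protos, proto_labels, feature_mask) == c
                  for x, c in zip(X, y))
    return correct / n - penalty * sum(chromosome) / len(chromosome)

def evolve(X, y, pop_size=20, generations=200, seed=0):
    """Steady-state loop: uniform crossover, one bit-flip mutation, replace the worst."""
    rng = random.Random(seed)
    length = len(X) + len(X[0])
    pop = [[rng.random() < 0.5 for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        a, b = rng.sample(pop, 2)
        child = [rng.choice(pair) for pair in zip(a, b)]
        i = rng.randrange(length)
        child[i] = not child[i]
        worst = min(range(pop_size), key=lambda k: fitness(pop[k], X, y))
        if fitness(child, X, y) > fitness(pop[worst], X, y):
            pop[worst] = child
    return max(pop, key=lambda c: fitness(c, X, y))
```

Calling `evolve(X, y)` on a list of numeric feature vectors `X` and class labels `y` returns a bit list whose first `len(X)` entries mark the retained cases and whose remaining entries mark the retained attributes. Scoring on the training cases themselves is a simplification; an unbiased estimate would need held-out data.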

[1] Lawrence Davis, et al. Hybridizing the Genetic Algorithm and the K Nearest Neighbors Classification Algorithm, 1991, ICGA.

[2] Terence C. Fogarty, et al. Genetic selection of features for clustering and classification, 1994.

[3] Ingo Rechenberg, et al. Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution, 1973.

[4] Josef Kittler, et al. Pattern recognition: a statistical approach, 1982.

[5] Imre Csiszár, et al. Topics in Information Theory, 1976.

[6] Jack Sklansky, et al. A note on genetic algorithms for large-scale feature selection, 1989, Pattern Recognition Letters.

[7] David J. Spiegelhalter, et al. Machine Learning, Neural and Statistical Classification, 2009.

[8] K. Fukunaga, et al. Nonparametric Data Reduction, 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9] Frederick Mosteller, et al. Data Analysis and Regression, 1978.

[10] Belur V. Dasarathy, et al. Nearest neighbor (NN) norms: NN pattern classification techniques, 1991.

[11] P. P. Crump. Statistical Analysis: A Computer Oriented Approach (2nd Ed.), 1982.

[12] C. W. Swonger. SAMPLE SET CONDENSATION FOR A CONDENSED NEAREST NEIGHBOR DECISION RULE FOR PATTERN RECOGNITION, 1972.

[13] J. L. Hodges, et al. Discriminatory Analysis - Nonparametric Discrimination: Consistency Properties, 1989.

[14] David J. Hand, et al. Experiments on the edited condensed nearest neighbor rule, 1978, Inf. Sci.

[15] R. Reznek, et al. Detection of Alcohol-Induced Fatty Liver by Computerized Tomography, 1988, Journal of the Royal Society of Medicine.

[16] R. Forsyth. Neural learning algorithms: some empirical trials, 1990.

[17] Chin-Liang Chang, et al. Finding Prototypes For Nearest Neighbor Classifiers, 1974, IEEE Transactions on Computers.

[18] R. Forsyth. Stylistic structures: a computational approach to text classification, 1996.

[19] Huan Liu, et al. Book review: Machine Learning, Neural and Statistical Classification Edited by D. Michie, D.J. Spiegelhalter and C.C. Taylor (Ellis Horwood Limited, 1994), 1996, SGAR.

[20] Bruce G. Batchelor, et al. Pattern Recognition: Ideas in Practice, 1978.

[21] W. Vent, et al. Rechenberg, Ingo, Evolutionsstrategie - Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. 170 S. mit 36 Abb. Frommann-Holzboog-Verlag. Stuttgart 1973. Broschiert, 1975.

[22] Michael Thompson, et al. Frontiers of Pattern Recognition, 1975.

[23] Sholom M. Weiss, et al. Computer Systems That Learn, 1990.

[24] I. Tomek. An Experiment with the Edited Nearest-Neighbor Rule, 1976.

[25] R. Fisher. THE USE OF MULTIPLE MEASUREMENTS IN TAXONOMIC PROBLEMS, 1936.

[26] Eric R. Ziegel, et al. Data: A Collection of Problems From Many Fields for the Student and Research Worker, 1987.

[27] Julian R. Ullmann, et al. Automatic selection of reference data for use in a nearest-neighbor method of pattern classification (Corresp.), 1974, IEEE Trans. Inf. Theory.

[28] H. Riedwyl, et al. Multivariate Statistics: A Practical Approach, 1988.

[29] David E. Goldberg, et al. Genetic Algorithms in Search Optimization and Machine Learning, 1988.

[30] Abdelmonem A. Afifi, et al. Statistical Analysis: A Computer Oriented Approach, 1973.

[31] Peter E. Hart, et al. The condensed nearest neighbor rule (Corresp.), 1968, IEEE Trans. Inf. Theory.

[32] Hugh B. Woodruff, et al. An algorithm for a selective nearest neighbor decision rule (Corresp.), 1975, IEEE Trans. Inf. Theory.

[33] David H. Ackley, et al. An empirical study of bit vector function optimization, 1987.

[34] D. Hawkins. Multivariate Statistics: A Practical Approach, 1990.

[35] Shlomo Geva, et al. Adaptive nearest neighbor pattern classification, 1991, IEEE Trans. Neural Networks.

[36] B. Manly. Multivariate Statistical Methods: A Primer, 1986.

[37] Lawrence Davis, et al. Handbook Of Genetic Algorithms, 1990.

[38] G. McLachlan. Discriminant Analysis and Statistical Pattern Recognition, 1992.

[39] Frederick Mosteller, et al. Applied Bayesian and classical inference: the case of the Federalist papers, 1984.

[40] J. Wolff. Towards a theory of cognition and computing, 1991.

[41] D. E. Goldberg, et al. Genetic Algorithms in Search, Optimization & Machine Learning, 1989.

[42] Richard Forsyth, et al. Classification by similarity: An overview of statistical methods of case-based reasoning, 1995.

[43] L. Darrell Whitley, et al. The GENITOR Algorithm and Selection Pressure: Why Rank-Based Allocation of Reproductive Trials is Best, 1989, ICGA.

[44] Dan Boneh, et al. On genetic algorithms, 1995, COLT '95.

[45] C. Darwin, et al. On the Tendency of Species to form Varieties; and on the Perpetuation of Varieties and Species by Natural Means of Selection, 1858.

[46] D. E. Goldberg, et al. Genetic Algorithms in Search, 1989.