Knowledge-independent data mining with fine-grained parallel evolutionary algorithms

This paper illustrates the application of evolutionary algorithms (EAs) to data mining problems. The objectives are to demonstrate that EAs can provide a competitive, general-purpose data mining scheme for classification tasks without constraining the knowledge representation, and that the time required can be reduced by exploiting the inherently parallel nature of EAs. Experiments were performed with GALE, a fine-grained parallel evolutionary algorithm, on several artificial, public-domain, and private datasets. The empirical results suggest that EAs are competitive and robust data mining schemes that scale up better than well-known non-evolutionary schemes.
