Multi-objective competitive coevolution for efficient GP classifier problem decomposition

A novel approach to the classification of large and unbalanced multi-class data sets is presented that addresses the widely acknowledged issues of scalability, solution transparency, and problem decomposition simultaneously within the genetic programming (GP) paradigm. A cooperative coevolutionary training environment that employs multi-objective evaluation provides the basis for problem decomposition and reduced solution complexity. Scalability is achieved through a Pareto competitive coevolutionary framework, which allows the system to be applied to large data sets (tens or hundreds of thousands of exemplars) without recourse to hardware-specific speedups. The approach also departs from canonical GP classification in that the output of a GP individual is expressed through a non-binary, local membership function (e.g. a Gaussian), so an expression no longer needs to represent an entire class. Decomposition is then achieved by reformulating the classification problem as one of cluster consistency: an appropriate subset of the training patterns is associated with each individual, so that problems are solved by several specialist classifiers rather than by a single 'super' individual.
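
Two ideas named above can be illustrated with a minimal sketch: wrapping a GP individual's raw output in a local Gaussian membership function, and selecting individuals by Pareto dominance over several objectives. This is not the authors' implementation. The names `gaussian_membership`, `dominates`, and `non_dominated`, the parameters `centre` and `width`, and the choice to maximize every objective are all assumptions made for illustration.

```python
import math


def gaussian_membership(gp_output, centre, width):
    """Map a raw GP output to a membership degree in (0, 1].

    `centre` and `width` are hypothetical parameters of the local
    membership function; how they are chosen is not specified here.
    """
    return math.exp(-((gp_output - centre) ** 2) / (2.0 * width ** 2))


def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (all objectives maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(scores):
    """Return the indices of the objective vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]
```

With a local membership function, an individual responds strongly only to patterns whose GP output falls near `centre`, which is what lets several specialists each cover part of a class. The non-dominated set then keeps individuals that trade off the objectives differently instead of collapsing to a single best scorer.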
