Hierarchical classification trees using type-constrained genetic programming

We investigate the capability of genetic programming to produce hierarchical, rule-based classification trees. These trees can be seen as an extension of the decision tree concept from machine learning, in which the predicates can be complex expressions rather than simple attribute-value comparisons. To improve the search ability and to produce meaningful results, type constraints, expressed in a BNF grammar, are applied to the genetic programming procedure. The model is tested in two well-known domains. On the Balance-Scale data, the system succeeds in revealing the rule that was used to create the data. On the E-Coli Protein Localization Sites data, the system achieves a classification score competitive with those reported in the literature while retaining the comprehensibility of the solution. The training procedure is guided by an adaptive fitness measure. The overall performance of the system indicates that it is competitive with standard computational intelligence procedures.
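As an illustration of how type constraints can keep evolved trees well-formed, the following is a minimal sketch and not the grammar or implementation used in the paper. It assumes a small hypothetical grammar (given in the comments) in which a tree is either a class label or an IF node whose condition compares two arithmetic expressions over attributes. The attribute names `LW`, `LD`, `RW`, `RD`, the operator sets, and all function names are assumptions chosen for the Balance-Scale domain, whose known creation rule compares left weight times left distance with right weight times right distance.

```python
import operator
import random

# Hypothetical grammar (illustrative only, not taken from the paper):
#   <tree> ::= IF <cond> THEN <tree> ELSE <tree> | <class>
#   <cond> ::= <expr> <relop> <expr>
#   <expr> ::= <expr> <arith> <expr> | <attr>
ATTRS = ["LW", "LD", "RW", "RD"]   # assumed attribute names
CLASSES = ["L", "B", "R"]          # left, balanced, right
ARITH = {"+": operator.add, "-": operator.sub, "*": operator.mul}
REL = {"<": operator.lt, ">": operator.gt}


def gen_expr(rng, depth):
    """Derive a numeric <expr>: an attribute or an arithmetic node."""
    if depth <= 0 or rng.random() < 0.4:
        return rng.choice(ATTRS)
    op = rng.choice(sorted(ARITH))
    return (op, gen_expr(rng, depth - 1), gen_expr(rng, depth - 1))


def gen_cond(rng, depth):
    """Derive a boolean <cond>: a comparison of two numeric expressions."""
    return (rng.choice(sorted(REL)), gen_expr(rng, depth), gen_expr(rng, depth))


def gen_tree(rng, depth):
    """Derive a <tree>: a class label or an IF node with two subtrees."""
    if depth <= 0 or rng.random() < 0.3:
        return rng.choice(CLASSES)
    return ("if", gen_cond(rng, depth),
            gen_tree(rng, depth - 1), gen_tree(rng, depth - 1))


def eval_expr(expr, row):
    """Evaluate a numeric expression on one data record (a dict)."""
    if isinstance(expr, str):
        return row[expr]
    op, left, right = expr
    return ARITH[op](eval_expr(left, row), eval_expr(right, row))


def eval_tree(tree, row):
    """Walk the IF nodes until a class label is reached."""
    while not isinstance(tree, str):
        _, (rel, left, right), if_true, if_false = tree
        holds = REL[rel](eval_expr(left, row), eval_expr(right, row))
        tree = if_true if holds else if_false
    return tree


# Hand-written tree encoding the known Balance-Scale creation rule.
LEFT_TORQUE = ("*", "LW", "LD")
RIGHT_TORQUE = ("*", "RW", "RD")
TARGET = ("if", (">", LEFT_TORQUE, RIGHT_TORQUE), "L",
          ("if", ("<", LEFT_TORQUE, RIGHT_TORQUE), "R", "B"))

if __name__ == "__main__":
    random_tree = gen_tree(random.Random(0), 3)
    record = {"LW": 2, "LD": 3, "RW": 1, "RD": 4}
    print(random_tree, "->", eval_tree(random_tree, record))
    print("target rule ->", eval_tree(TARGET, record))
```

Because each generator only calls the generators allowed by its grammar rule, every derived tree is type-correct by construction: conditions are always boolean, expressions always numeric, and leaves always class labels. A full system would additionally need type-preserving crossover and mutation and a fitness measure, which this sketch omits.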
