Simplifying Decision Trees Learned by Genetic Programming

This paper presents a method for post-processing decision trees, motivated by financial forecasting with genetic programming. The procedure analyzes and evaluates the components of each tree and then prunes the tree accordingly. The idea is to identify and eliminate rules that cause misclassification; as a result, we expect to retain and generate rules that improve classification. The method was tested on decision trees generated by a genetic program whose aim was to discover classification rules in financial stock markets. The experimental results indicate that the method improves both the accuracy and the precision of the classification.
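The general idea of evaluating each rule and pruning those that cause misclassification can be illustrated with a minimal sketch. It is not the procedure from the paper: the `Rule` class, the per-rule precision criterion, the `min_precision` threshold, and the feature names `rsi` and `mom` are all assumptions made for illustration. A tree is treated here as a set of rules, each of which predicts the positive class when its condition holds.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Sample = Dict[str, float]


@dataclass
class Rule:
    """Hypothetical rule: predicts the positive class when `condition` holds."""
    name: str
    condition: Callable[[Sample], bool]


def rule_precision(rule: Rule, samples: Sequence[Sample], labels: Sequence[int]) -> float:
    """Fraction of the samples matched by the rule that are truly positive."""
    fired = [y for x, y in zip(samples, labels) if rule.condition(x)]
    if not fired:
        return 0.0  # a rule that never fires contributes nothing
    return sum(fired) / len(fired)


def prune_rules(rules: Sequence[Rule], samples: Sequence[Sample],
                labels: Sequence[int], min_precision: float = 0.5) -> List[Rule]:
    """Keep only rules whose precision reaches the (assumed) threshold."""
    return [r for r in rules if rule_precision(r, samples, labels) >= min_precision]


def classify(rules: Sequence[Rule], x: Sample) -> int:
    """Predict positive (1) if any remaining rule fires, else negative (0)."""
    return int(any(r.condition(x) for r in rules))


def accuracy(rules: Sequence[Rule], samples: Sequence[Sample], labels: Sequence[int]) -> float:
    """Share of samples whose predicted class matches the label."""
    hits = sum(classify(rules, x) == y for x, y in zip(samples, labels))
    return hits / len(labels)


# Toy data with made-up indicator values; 1 marks the positive class.
samples = [
    {"rsi": 20.0, "mom": 1.0},
    {"rsi": 25.0, "mom": -1.0},
    {"rsi": 80.0, "mom": 2.0},
    {"rsi": 75.0, "mom": 0.5},
    {"rsi": 50.0, "mom": -0.5},
]
labels = [1, 1, 0, 0, 0]

rules = [
    Rule("low_rsi", lambda x: x["rsi"] < 30),  # fires only on positive samples
    Rule("pos_mom", lambda x: x["mom"] > 0),   # fires mostly on negative samples
]

pruned = prune_rules(rules, samples, labels)
print([r.name for r in pruned])
print(accuracy(rules, samples, labels), accuracy(pruned, samples, labels))
```

In this toy setting the rule that fires mostly on negative samples is removed, and accuracy on the same data rises. A real evaluation would measure the pruned tree on held-out data rather than on the data used to score the rules.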
