Lossless fitness inheritance in genetic algorithms for decision trees

When genetic algorithms are used to evolve decision trees, key tree-quality parameters can be computed recursively and reused across generations of partially similar trees. Storing instance indices at the leaves is sufficient for fitness to be computed piecewise without loss. We derive the (substantial) expected speedup on two bounding-case problems and trace this attractive property of lossless fitness inheritance to the divide-and-conquer nature of decision trees. Experimental evidence supports the theoretical results.
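The idea of piecewise fitness computation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that fitness is simply the number of correctly classified training instances, that trees are binary with threshold tests, and that every node (not only the leaves) caches the indices that reach it. The names `Node`, `evaluate`, and `replace_subtree` are hypothetical. When a subtree is swapped in by crossover or mutation, only the instances that reach the replaced node are re-routed, and the cached counts of its ancestors are adjusted by the difference.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # Internal node: tests x[attr] <= thr. Leaf: attr is None and predicts `label`.
    attr: Optional[int] = None
    thr: float = 0.0
    label: int = 0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    idx: List[int] = field(default_factory=list)  # training instances reaching this node
    correct: int = 0  # cached count of correctly classified instances in this subtree


def evaluate(node, idx, X, y):
    """Route instances `idx` through `node`, caching indices and correct counts."""
    node.idx = list(idx)
    if node.attr is None:
        node.correct = sum(1 for i in node.idx if y[i] == node.label)
    else:
        go_left = [i for i in node.idx if X[i][node.attr] <= node.thr]
        go_right = [i for i in node.idx if X[i][node.attr] > node.thr]
        node.correct = (evaluate(node.left, go_left, X, y)
                        + evaluate(node.right, go_right, X, y))
    return node.correct


def replace_subtree(root, path, new_subtree, X, y):
    """Swap in `new_subtree` at `path` (a string of 'L'/'R' steps from the root).

    Only the instances cached at the replaced node are re-routed; ancestors
    inherit the rest of their fitness and are corrected by the difference.
    """
    ancestors = []
    node = root
    for step in path:
        ancestors.append(node)
        node = node.left if step == "L" else node.right
    delta = evaluate(new_subtree, node.idx, X, y) - node.correct
    if not ancestors:  # the whole tree was replaced
        return new_subtree
    if path[-1] == "L":
        ancestors[-1].left = new_subtree
    else:
        ancestors[-1].right = new_subtree
    for a in ancestors:
        a.correct += delta
    return root


# Toy data (made up for illustration): six instances, two numeric attributes.
X = [[1, 5], [2, 1], [3, 4], [4, 2], [5, 6], [6, 0]]
y = [0, 0, 1, 0, 1, 0]

tree = Node(attr=0, thr=2.5, left=Node(label=0), right=Node(label=1))
before = evaluate(tree, range(len(X)), X, y)

# Replace the right leaf with a small subtree; only 4 of 6 instances are re-routed.
new_right = Node(attr=1, thr=3.0, left=Node(label=0), right=Node(label=1))
tree = replace_subtree(tree, "R", new_right, X, y)
inherited = tree.correct

# A full re-evaluation from scratch gives the same value, i.e. nothing is lost.
recomputed = evaluate(tree, range(len(X)), X, y)
```

In this sketch the inherited value equals the value obtained by re-evaluating the whole tree, because the instances reaching the untouched left branch are unaffected by a change in the right branch. That independence between branches is the divide-and-conquer property referred to above.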
