Comparison of a genetic algorithm to grammatical evolution for automated design of genetic programming classification algorithms

Abstract: Genetic programming (GP) is gaining increased attention as an effective method for inducing classifiers for data classification. However, manually designing a GP classification algorithm is a non-trivial, time-consuming process. This research investigates the hypothesis that automating the design of a GP classification algorithm can still lead to the induction of effective classifiers while also reducing design time. Two evolutionary algorithms, namely a genetic algorithm (GA) and grammatical evolution (GE), are used to automate the design of GP classification algorithms. The automatically designed GP classifiers, i.e. the GA-designed and GE-designed GP classifiers, are compared to each other and to manually designed GP classifiers in terms of classification performance on real-world problems. The design times of automated and manual design are also compared on the same set of problems. The automatically designed classifiers were found to outperform the manually designed classifiers across problem domains, and automated design time was found to be less than manual design time. For the considered datasets, GE performed better for binary classification, while the GA performed better for multiclass classification. Overall, the results of the study support the hypothesis.
