Multi-objective Classification Rule Mining Using Gene Expression Programming

In this paper, the classification rule-mining problem is treated as a multi-objective problem rather than a single-objective one. Metrics used for evaluating a rule, such as predictive accuracy and comprehensibility, can be thought of as different criteria of this problem. Predictive accuracy measures the accuracy of the rules extracted from the dataset, whereas comprehensibility is measured by the number of attributes involved in the rule and tries to quantify how understandable the rule is. Using these measures as the objectives of the rule-mining problem, this paper uses gene expression programming to extract useful and understandable rules. The discovered rules (knowledge) are expressed as high-level IF-THEN statements. Gene expression programming has recently been introduced as one of the components of evolutionary algorithms, and attributes such as its simple linear representation and ease of implementation motivate us to use it for mining classification rules with multiple objectives. It is often criticized when applied to classification rule mining with multiple objectives because of the amount of computational resources it requires. However, we believe that it has considerable potential to perform global search by exploring a large search space. The antecedent part of a rule may contain different combinations of predictor attributes, while the consequent part contains only the goal attribute. The search process is guided by a fitness function that considers both predictive accuracy and comprehensibility. Experiments with several benchmark datasets have generated rules for each class with acceptable predictive accuracy and comprehensibility.
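
To make the two objectives concrete, the following is a minimal Python sketch of how an IF-THEN rule might be scored. It is not the formulation used in the paper, which the abstract does not spell out. The names `Rule`, `predictive_accuracy`, `comprehensibility`, and `fitness` are hypothetical, and so are the specific formulas: accuracy is taken as the fraction of examples the rule handles correctly, comprehensibility as one minus the share of predictor attributes used, and the two are combined by a weighted sum with arbitrary weights. The toy dataset is invented for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Example = Dict[str, object]
Condition = Tuple[str, Callable[[object], bool]]  # (attribute name, test on its value)


@dataclass
class Rule:
    """IF <all antecedent conditions hold> THEN goal_attribute = predicted_class."""
    antecedent: List[Condition]
    goal_attribute: str
    predicted_class: object

    def covers(self, example: Example) -> bool:
        # The antecedent is a conjunction of tests on predictor attributes.
        return all(test(example[attr]) for attr, test in self.antecedent)


def predictive_accuracy(rule: Rule, data: List[Example]) -> float:
    # Assumed definition: share of examples where "rule fires" agrees with
    # "example belongs to the predicted class" (true positives + true negatives).
    if not data:
        raise ValueError("data must not be empty")
    correct = sum(
        rule.covers(ex) == (ex[rule.goal_attribute] == rule.predicted_class)
        for ex in data
    )
    return correct / len(data)


def comprehensibility(rule: Rule, n_predictors: int) -> float:
    # Assumed definition: fewer distinct attributes in the rule means a higher score.
    used = {attr for attr, _ in rule.antecedent}
    return 1.0 - len(used) / n_predictors


def fitness(rule: Rule, data: List[Example], n_predictors: int,
            w_acc: float = 0.7, w_comp: float = 0.3) -> float:
    # Assumed aggregation: a weighted sum; the weights are illustrative only.
    return (w_acc * predictive_accuracy(rule, data)
            + w_comp * comprehensibility(rule, n_predictors))


# Invented toy data: three predictor attributes and the goal attribute "play".
data = [
    {"outlook": "sunny", "humidity": 85, "windy": False, "play": "no"},
    {"outlook": "sunny", "humidity": 90, "windy": True, "play": "no"},
    {"outlook": "overcast", "humidity": 78, "windy": False, "play": "yes"},
    {"outlook": "rain", "humidity": 70, "windy": False, "play": "yes"},
]

# IF outlook = sunny AND humidity > 80 THEN play = no
long_rule = Rule(
    antecedent=[("outlook", lambda v: v == "sunny"), ("humidity", lambda v: v > 80)],
    goal_attribute="play",
    predicted_class="no",
)

# IF outlook = sunny THEN play = no
short_rule = Rule(
    antecedent=[("outlook", lambda v: v == "sunny")],
    goal_attribute="play",
    predicted_class="no",
)

for name, rule in [("long", long_rule), ("short", short_rule)]:
    print(name, fitness(rule, data, n_predictors=3))
```

On this toy data both rules classify every example correctly, so the shorter rule scores higher purely because it uses fewer attributes. A weighted sum is only one way to handle multiple objectives; a Pareto-based comparison of the two scores would be an alternative under the same definitions.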
