Evolutionary Approach for Automated Discovery of Censored Production Rules

Interest in applying evolutionary methods to Knowledge Discovery in Databases (KDD) has grown in recent years, and a number of successful applications of Genetic Algorithms (GA) and Genetic Programming (GP) to KDD have been demonstrated. The predominant representation of the discovered knowledge is the standard Production Rule (PR) of the form If P Then D. PRs, however, cannot handle exceptions and do not exhibit variable precision. Censored Production Rules (CPRs), an extension of PRs proposed by Michalski & Winston, exhibit variable precision and support an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form If P Then D Unless C, where C (the censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement ‘If P Then D’ holds frequently and the assertion C holds rarely. With a rule of this type, the exception condition can be ignored when the resources needed to establish its presence are tight, or when there is simply no information available as to whether it holds. Thus, the ‘If P Then D’ part of the CPR expresses the important information, while the ‘Unless C’ part acts only as a switch that changes the polarity of D to ~D. This paper presents a classification algorithm based on an evolutionary approach that discovers comprehensible rules with exceptions in the form of CPRs. The proposed approach uses a flexible chromosome encoding in which each chromosome corresponds to a CPR. Appropriate genetic operators are suggested, and a fitness function is proposed that incorporates the basic constraints on CPRs. Experimental results are presented to demonstrate the performance of the proposed algorithm.

Keywords: Censored Production Rule, Data Mining, Machine Learning, Evolutionary Algorithms.
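The semantics of a single CPR described above can be illustrated with a minimal Python sketch. It is not the paper's chromosome encoding or algorithm; the class name `CensoredRule`, the `check_censor` flag, and the bird/penguin example are hypothetical and chosen only to show how the censor flips D to ~D and how it can be skipped when resources are tight or the information is missing.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional

Record = Mapping[str, object]


@dataclass(frozen=True)
class CensoredRule:
    """Hypothetical container for one rule: If P Then D Unless C."""
    premise: Callable[[Record], bool]            # P
    decision: str                                # D
    censor: Callable[[Record], Optional[bool]]   # C; None means "unknown"

    def classify(self, record: Record, check_censor: bool = True) -> Optional[str]:
        # The rule does not fire at all when P is false.
        if not self.premise(record):
            return None
        # The censor acts as a switch: only a known-true C flips D to ~D.
        # If the censor is skipped or its value is unknown, D is asserted.
        if check_censor and self.censor(record) is True:
            return "~" + self.decision
        return self.decision


# Illustrative rule (an assumption, not from the paper):
# If is_bird Then flies Unless is_penguin
rule = CensoredRule(
    premise=lambda r: r.get("is_bird") is True,
    decision="flies",
    censor=lambda r: r.get("is_penguin"),  # returns None when the attribute is absent
)
```

Calling `rule.classify(record, check_censor=False)` models the low-resource case in which the exception is ignored, so only the ‘If P Then D’ part is used.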
