A new approach for measuring rule set consistency

Various algorithms can learn a set of classification rules from observations and their corresponding class labels. The resulting rule set is usually evaluated by measuring its accuracy on unseen examples, but other evaluation criteria, such as comprehensibility and consistency, are often overlooked. In this paper we focus on consistency: if a rule learner is applied several times to the same data set, will it produce similar rule sets across the different runs? We propose a new measure and show through various examples how it can be used to choose between different algorithms and rule sets, or to determine whether the rules in a knowledge base need to be updated.
