International Conference on Information Fusion (FUSION)

Automatically Balancing Accuracy and Comprehensibility in Predictive Modeling

One specific problem in predictive modeling is the tradeoff between accuracy and comprehensibility. When comprehensible models are required, this normally rules out high-accuracy techniques such as neural networks and committee machines. An automated choice of a standard technique, known to generally produce sufficiently accurate and comprehensible models, would therefore be of great value. This paper argues that an ensemble of classifiers followed by rule extraction meets this requirement. The proposed technique is demonstrated on 17 publicly available data sets, using an ensemble of common classifiers and our rule extraction algorithm G-REX. The results show that the ensemble clearly outperforms the individual classifiers in accuracy, while the extracted models have accuracy similar to that of the individual classifiers. The extracted models are, however, significantly more compact than corresponding models created directly from the data set using the standard tool CART, and thus provide higher comprehensibility.
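The two-stage idea (train an ensemble, then extract a compact rule from it) can be illustrated with a minimal Python sketch. This is not G-REX, which is based on genetic programming, and the ensemble members are not the classifiers used in the paper. As assumptions for illustration only, the ensemble is a majority vote over one-feature threshold rules, the "extracted" model is a single threshold rule fitted to the ensemble's predictions rather than to the true labels, and the data set is made up. All function names are hypothetical.

```python
from collections import Counter

def predict_stump(stump, row):
    """A stump is (feature, threshold, polarity); polarity 1 means 'class 1 if value > threshold'."""
    feature, threshold, polarity = stump
    return int((row[feature] > threshold) == bool(polarity))

def fit_stump(X, targets, feature):
    """Return (errors, stump) for the best threshold rule on one feature."""
    best = None
    for threshold in sorted({row[feature] for row in X}):
        for polarity in (0, 1):
            stump = (feature, threshold, polarity)
            errors = sum(predict_stump(stump, row) != t for row, t in zip(X, targets))
            if best is None or errors < best[0]:
                best = (errors, stump)
    return best

def ensemble_predict(stumps, row):
    """Majority vote over the ensemble members (odd member count, so no ties)."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]

def extract_rule(X, oracle_labels):
    """Fit one transparent rule to the ensemble's outputs (the ensemble acts as an oracle)."""
    n_features = len(X[0])
    candidates = [fit_stump(X, oracle_labels, f) for f in range(n_features)]
    return min(candidates, key=lambda c: c[0])[1]

def agreement(a, b):
    """Fraction of positions where two label sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Made-up toy data: three numeric features, binary class label.
X = [(9, 1, 2), (2, 2, 1), (3, 1, 3), (1, 3, 8), (2, 2, 2),
     (7, 8, 7), (8, 2, 9), (6, 7, 8), (7, 9, 6), (9, 6, 7)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Stage 1: an ensemble with one member per feature, each trained on the true labels.
members = [fit_stump(X, y, f)[1] for f in range(len(X[0]))]
ensemble_labels = [ensemble_predict(members, row) for row in X]
ensemble_accuracy = agreement(ensemble_labels, y)

# Stage 2: extract a single rule that mimics the ensemble.
rule = extract_rule(X, ensemble_labels)
rule_labels = [predict_stump(rule, row) for row in X]
fidelity = agreement(rule_labels, ensemble_labels)   # agreement with the ensemble
rule_accuracy = agreement(rule_labels, y)            # agreement with the true labels

feature, threshold, polarity = rule
op = ">" if polarity else "<="
print(f"Extracted rule: IF x[{feature}] {op} {threshold} THEN class 1 ELSE class 0")
print(f"ensemble accuracy={ensemble_accuracy}, fidelity={fidelity}, rule accuracy={rule_accuracy}")
```

All figures here are computed on the training data of a toy example, so they say nothing about generalization. The point is only the division of labor: the ensemble is optimized for accuracy, and the extracted rule trades some of that accuracy for a model small enough to read.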
