Minerva: Sequential Covering for Rule Extraction

Benchmarking studies have repeatedly shown that artificial neural networks and support vector machines often outperform more traditional machine learning techniques. The main objection to these newer techniques is their lack of interpretability: it is difficult for a human analyst to understand the reasoning behind the models' decisions. Various rule extraction (RE) techniques have been proposed to overcome this opacity by representing the behavior of the complex model with a set of easily understandable rules. However, most existing RE techniques apply only under limited circumstances; for example, they assume that all inputs are categorical, or they work only when the black-box model is a neural network. In this paper, we present Minerva, a new RE algorithm based on sequential covering. Its main advantage is the ability to extract a set of rules from any type of black-box model. Experiments show that the extracted rule sets perform well in comparison with various other rule and decision tree learners.
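The abstract does not spell out the algorithm, but the title points to the generic sequential-covering idea: learn one rule that mimics the black-box predictions on part of the input space, remove the examples that rule covers, and repeat. The sketch below illustrates that generic loop in Python against a scikit-learn-style predict() interface; the single-condition rules, thresholds, and parameters (max_rules, min_coverage) are illustrative assumptions, not the actual Minerva procedure.

```python
# Illustrative sketch of sequential covering for rule extraction from a
# black-box classifier (NOT the Minerva algorithm itself). Assumptions:
# numeric features, single axis-aligned conditions per rule, and a black-box
# exposing a scikit-learn-style predict() method.
import numpy as np

def extract_rules(black_box, X, max_rules=10, min_coverage=5):
    """Greedily learn rules (feature, threshold, direction) -> label that
    mimic the black-box predictions on X."""
    y = black_box.predict(X)                 # targets come from the model, not the data
    remaining = np.ones(len(X), dtype=bool)  # examples not yet covered by any rule
    rules = []
    for _ in range(max_rules):
        if remaining.sum() < min_coverage:
            break
        best = None                          # (accuracy, coverage, feature, thr, side, label)
        for f in range(X.shape[1]):
            for thr in np.unique(X[remaining, f]):
                for side in (np.less_equal, np.greater):
                    covered = remaining & side(X[:, f], thr)
                    if covered.sum() < min_coverage:
                        continue
                    labels, counts = np.unique(y[covered], return_counts=True)
                    acc = counts.max() / covered.sum()
                    cand = (acc, covered.sum(), f, thr, side, labels[counts.argmax()])
                    if best is None or cand[:2] > best[:2]:
                        best = cand
        if best is None:
            break
        acc, cov, f, thr, side, label = best
        rules.append((f, thr, side.__name__, label))
        remaining &= ~side(X[:, f], thr)     # remove covered examples and repeat
    return rules
```

A rule set obtained this way is typically judged by its fidelity, i.e., how closely it reproduces the black-box predictions on held-out data, which is the usual evaluation criterion in the rule extraction literature.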
