Extracting Rules From Neural Networks as Decision Diagrams

Rule extraction from neural networks (NNs) solves two fundamental problems: it gives insight into the logic behind the network and, in many cases, improves the network's ability to generalize the acquired knowledge. This paper presents a novel eclectic approach to rule extraction from NNs, named LOcal Rule Extraction (LORE), suited for multilayer perceptron networks with discrete (logical or categorical) inputs. The extracted rules mimic network behavior on the training set and relax this condition on the remaining input space. First, a multilayer perceptron network is trained under a standard regime. It is then transformed into an equivalent form that returns the same numerical result as the original network, yet can produce rules generalizing the network output for cases similar to a given input. The partial rules extracted for every training set sample are then merged to form a decision diagram (DD) from which logic rules can be extracted. A rule format is presented that explicitly separates subsets of inputs for which an answer is known from those with an undetermined answer. A special data structure, the decision diagram, is introduced to allow efficient merging of partial rules. With regard to rule complexity and generalization ability, LORE gives results comparable to those reported previously. An algorithm that transforms DDs into interpretable Boolean expressions is described. Experimental running times of rule extraction are proportional to the network's training time.
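The idea of merging partial rules into a decision diagram and reading Boolean rules back out can be illustrated with a small sketch. This is not the LORE algorithm itself: it assumes binary inputs, a fixed variable order, and a three-valued diagram whose terminals are `True`, `False`, or an "undetermined" marker. All names (`UNKNOWN`, `mk`, `from_rule`, `merge`, `evaluate`, `rules_for`) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; not the paper's implementation.
# A node is either a terminal (True, False, UNKNOWN) or a tuple (var, lo, hi),
# where lo/hi are the sub-diagrams for input[var] == 0 / 1.

UNKNOWN = "?"  # terminal for inputs whose answer is undetermined


def mk(var, lo, hi):
    """Create a node, skipping the test when both branches are identical."""
    return lo if lo == hi else (var, lo, hi)


def from_rule(conds, answer):
    """Partial rule: if every input[var] == conds[var], the answer is known;
    everywhere else it is UNKNOWN."""
    node = answer
    for var in sorted(conds, reverse=True):  # build bottom-up in variable order
        node = mk(var, UNKNOWN, node) if conds[var] else mk(var, node, UNKNOWN)
    return node


def merge(a, b):
    """Combine two partial diagrams; a known answer overrides UNKNOWN."""
    if a == UNKNOWN:
        return b
    if b == UNKNOWN:
        return a
    a_is_node, b_is_node = isinstance(a, tuple), isinstance(b, tuple)
    if not a_is_node and not b_is_node:
        if a != b:
            raise ValueError("partial rules disagree on some input")
        return a
    # Split on the smallest variable index tested by either diagram.
    var = min(n[0] for n in (a, b) if isinstance(n, tuple))
    a_lo, a_hi = (a[1], a[2]) if a_is_node and a[0] == var else (a, a)
    b_lo, b_hi = (b[1], b[2]) if b_is_node and b[0] == var else (b, b)
    return mk(var, merge(a_lo, b_lo), merge(a_hi, b_hi))


def evaluate(node, x):
    """Follow the diagram for input vector x down to a terminal."""
    while isinstance(node, tuple):
        var, lo, hi = node
        node = hi if x[var] else lo
    return node


def rules_for(node, target, path=()):
    """List every root-to-terminal path ending in `target` as a conjunction
    {var: value}; together the paths form a disjunction (DNF)."""
    if not isinstance(node, tuple):
        return [dict(path)] if node == target else []
    var, lo, hi = node
    return (rules_for(lo, target, path + ((var, 0),))
            + rules_for(hi, target, path + ((var, 1),)))


# Two partial rules: (x0 AND x1) -> True, (NOT x0) -> False.
dd = merge(from_rule({0: 1, 1: 1}, True), from_rule({0: 0}, False))
```

With these two rules, `evaluate(dd, [1, 0])` lands on the undetermined terminal, since neither rule covers that input, while `rules_for(dd, True)` recovers the single conjunction `x0 AND x1`. Keeping the undetermined region as an explicit terminal is one simple way to separate known answers from unknown ones; the paper's actual rule format and merging procedure may differ.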
