Extracting M-of-N Rules from Trained Neural Networks

An effective algorithm for extracting M-of-N rules from trained feedforward neural networks is proposed. Two components of the algorithm distinguish our method from previously proposed algorithms which extract symbolic rules from neural networks. First, we train a network where each input of the data can only have one of two possible values, -1 or 1. Second, we apply the hyperbolic tangent function to each connection from the input layer to the hidden layer of the network. By applying this squashing function, the activation values at the hidden units are effectively computed as the hyperbolic tangent (or the sigmoid) of the weighted inputs, where the weights have magnitudes equal to one. By restricting the inputs and the weights to the binary values -1 or 1, the extraction of M-of-N rules from the networks becomes trivial. We demonstrate the effectiveness of the proposed algorithm on several widely tested datasets. For datasets consisting of thousands of patterns with many attributes, the rules extracted by the algorithm are surprisingly simple and accurate.

Index Terms—DNF rules, feedforward neural networks, M-of-N rules, network pruning, rule extraction.
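To make the extraction step concrete, the sketch below shows why a hidden unit with inputs and effective weights restricted to -1 or 1 reduces to an M-of-N rule. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name extract_m_of_n_rule and its arguments are hypothetical, the incoming weights are assumed to have already been squashed to magnitude one (e.g., via the hyperbolic tangent during training), and the unit is taken to be active when its net input exceeds a given threshold.

import numpy as np

def extract_m_of_n_rule(weights, bias=0.0, threshold=0.0):
    # Hypothetical sketch. The unit's net input is
    #   net = sum_i w_i * x_i + bias,  with w_i, x_i in {-1, +1}.
    # Each product w_i * x_i is +1 when input i agrees with the sign
    # of its weight and -1 otherwise. If k of the N inputs agree,
    #   net = 2k - N + bias,
    # so the unit is active (net > threshold) exactly when k >= M.
    signs = np.sign(weights).astype(int)   # effective +/-1 weights
    n = len(signs)                         # N: number of antecedents
    # Smallest integer k with 2k - n + bias > threshold:
    m = int(np.floor((n + threshold - bias) / 2.0)) + 1
    # The N literals: "input i equals the sign of its weight"
    literals = [f"x{i} == {int(s):+d}" for i, s in enumerate(signs)]
    return m, literals

# Example: three squashed weights, zero bias -> "at least 2 of 3" rule
m, literals = extract_m_of_n_rule(np.array([1.0, -1.0, 1.0]))
print(f"Unit is active when at least {m} of {len(literals)} literals hold:")
print(literals)

The arithmetic above only illustrates why the restriction to -1/+1 inputs and weights makes the final rule reading trivial; per the index terms, the full method also involves network pruning, which this sketch does not cover.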
