ANN-DT: an algorithm for extraction of decision trees from artificial neural networks

Although artificial neural networks can represent a variety of complex systems with a high degree of accuracy, these connectionist models are difficult to interpret. This significantly limits the applicability of neural networks in practice, especially where a premium is placed on the comprehensibility or reliability of systems. A novel artificial neural-network decision tree algorithm (ANN-DT) is therefore proposed, which extracts binary decision trees from a trained neural network. The ANN-DT algorithm uses the neural network to generate outputs for samples interpolated from the training data set. In contrast to existing techniques, ANN-DT can extract rules from feedforward neural networks with continuous outputs. These rules are extracted without making assumptions about the internal structure of the neural network or the features of the data. A novel attribute selection criterion, based on an analysis of the significance of the input variables to the neural-network output, is examined. It is shown to have significant benefits in certain cases when compared with the standard criterion of minimum weighted variance over the branches. In three case studies the ANN-DT algorithm compared favorably with CART, a standard decision tree algorithm.
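The extraction loop described above (interpolate new samples from the training data, label them with the trained network, then grow a binary tree on the labelled samples) can be illustrated with a minimal Python sketch. It is not the published ANN-DT procedure: the abstract does not specify how samples are interpolated, so drawing points on line segments between random pairs of training points is an assumption, and only the standard minimum-weighted-variance criterion is shown, not the proposed significance-based one. The function `network` is a hypothetical stand-in for a trained network's forward pass, and every other name here is likewise illustrative.

```python
import random

def interpolate_samples(X, n_samples, rng):
    """Assumed sampling scheme: points on segments between random training pairs."""
    samples = []
    for _ in range(n_samples):
        a, b = rng.choice(X), rng.choice(X)
        t = rng.random()
        samples.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return samples

def variance(ys):
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)

def best_split(X, y):
    """Return (feature, threshold) minimising weighted variance over both branches,
    or None if no split improves on the unsplit node."""
    best, best_score = None, variance(y)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [yi for row, yi in zip(X, y) if row[f] <= thr]
            right = [yi for row, yi in zip(X, y) if row[f] > thr]
            score = (len(left) * variance(left)
                     + len(right) * variance(right)) / len(y)
            if score < best_score:
                best, best_score = (f, thr), score
    return best

def build_tree(X, y, depth=0, max_depth=3, min_samples=5):
    """Grow a binary regression tree; leaves store the mean network output."""
    split = None
    if depth < max_depth and len(y) >= min_samples:
        split = best_split(X, y)
    if split is None:
        return {"value": sum(y) / len(y)}
    f, thr = split
    left = [(r, yi) for r, yi in zip(X, y) if r[f] <= thr]
    right = [(r, yi) for r, yi in zip(X, y) if r[f] > thr]
    return {
        "feature": f,
        "threshold": thr,
        "left": build_tree([r for r, _ in left], [v for _, v in left],
                           depth + 1, max_depth, min_samples),
        "right": build_tree([r for r, _ in right], [v for _, v in right],
                            depth + 1, max_depth, min_samples),
    }

def predict(tree, x):
    while "value" not in tree:
        branch = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]

def network(x):
    # Hypothetical stand-in for a trained network; only its outputs are used.
    return 1.0 if x[0] > 0.5 else 0.0

rng = random.Random(0)
train_X = [[rng.random(), rng.random()] for _ in range(50)]
queries = train_X + interpolate_samples(train_X, 300, rng)
targets = [network(x) for x in queries]  # the network labels every sample
tree = build_tree(queries, targets)
```

The tree only ever sees the network's outputs, never its weights, which mirrors the claim that no assumptions are made about the network's internal structure. A step function is used as the stand-in so that the recovered split is easy to check by hand; because the leaves store mean outputs and the split criterion is a variance, the same code applies unchanged to a network with continuous outputs.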
