The problem of extracting crisp logical rules from neural networks trained with a backpropagation algorithm is solved by a smooth transformation of these networks into simpler networks that perform logical functions. Two constraints are included in the cost function: a regularization term inducing weight decay, and an additional term forcing the remaining weights to the integer values ±1. Networks with a minimal number of connections are created, leading to a small number of crisp logical rules. A constructive algorithm is proposed in which rules are generated consecutively by adding more nodes to the network. The most general rules, covering many training examples, are created first, followed by more specific rules covering only a few cases. Applied to the Iris classification problem, this constructive algorithm generates two rules with three antecedents, giving 98.7% accuracy. A single rule for the mushroom problem yields 98.52% accuracy, while three additional rules allow perfect classification. The rules found for the three Monk's problems classify all examples correctly.
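As a minimal sketch of the two constraint terms described above, the snippet below implements one plausible functional form: a standard quadratic weight-decay penalty, and a sixth-degree term w²(w−1)²(w+1)² whose minima sit at 0 and ±1, so that surviving weights are pushed toward ±1 while the rest decay to zero. The exact penalty form and the lambda values are assumptions for illustration, not settings confirmed by the abstract.

```python
import numpy as np

def rule_extraction_penalty(weights, lam1=1e-4, lam2=1e-4):
    """Constraint terms added to the standard backpropagation error.

    lam1 * sum(w^2)                    -- weight decay: prunes connections
    lam2 * sum(w^2 (w-1)^2 (w+1)^2)    -- forces remaining weights to +/-1

    lam1 and lam2 are illustrative values, not the paper's settings.
    """
    w = np.asarray(weights, dtype=float)
    decay = lam1 * np.sum(w ** 2)
    integerize = lam2 * np.sum(w ** 2 * (w - 1) ** 2 * (w + 1) ** 2)
    return decay + integerize

def penalty_gradient(weights, lam1=1e-4, lam2=1e-4):
    """Gradient of the penalty, added to the usual backprop gradient."""
    w = np.asarray(weights, dtype=float)
    g_decay = 2.0 * lam1 * w
    # d/dw [w^2 (w^2 - 1)^2] = 2w (w^2 - 1)(3w^2 - 1)
    g_int = 2.0 * lam2 * w * (w ** 2 - 1) * (3 * w ** 2 - 1)
    return g_decay + g_int
```

With this design, gradually increasing lam2 during training anneals the network toward one whose weights are exactly 0 or ±1, at which point each hidden node computes a crisp logical function of its inputs.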