The Extraction of Refined Rules from Knowledge-based Neural Networks

Neural networks, despite their empirically-proven abilities, have been little used for the refinement of existing knowledge because this task requires a three-step process. First, knowledge in some form must be inserted into a neural network. Second, the network must be refined. Third, knowledge must be extracted from the network. We have previously described a method for the first step of this process. Standard neural learning techniques can accomplish the second step. In this paper, we propose and empirically evaluate a method for the final, and possibly most difficult, step. This method efficiently extracts symbolic rules from trained neural networks. The four major results of empirical tests of this method are that the extracted rules: (1) closely reproduce (and can even exceed) the accuracy of the network from which they are extracted; (2) are superior to the rules produced by methods that directly refine symbolic rules; (3) are superior to those produced by previous techniques for extracting rules from trained neural networks; (4) are "human comprehensible." Thus, the method demonstrates that neural networks can be an effective tool for the refinement of symbolic knowledge. Moreover, the rule-extraction technique developed herein contributes to the understanding of how symbolic and connectionist approaches to artificial intelligence can be profitably integrated.
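
To make the three-step process concrete, the following is a minimal Python sketch of the first and third steps on a single unit. It is an illustration under stated assumptions and not the method evaluated in the paper: the rule encoding (a weight of `omega` per positive antecedent, `-omega` per negated antecedent, and a bias of `-(P - 0.5) * omega` for `P` positive antecedents), the function names, the antecedent names, and the "refined" weights are all hypothetical choices made for the example. The extraction step shown simply reads a unit whose surviving weights are nearly equal as an M-of-N rule.

```python
import math


def insert_rule(positive, negated, omega=4.0):
    """Step 1 (assumed encoding): map one conjunctive rule onto one unit.

    Positive antecedents get weight +omega, negated ones -omega, and the
    bias is set so the unit is active only when every antecedent holds.
    """
    weights = {name: omega for name in positive}
    weights.update({name: -omega for name in negated})
    bias = -(len(positive) - 0.5) * omega
    return weights, bias


def unit_active(weights, bias, inputs):
    """Threshold view of a unit: active when the net input is positive."""
    net = sum(w * inputs.get(name, 0) for name, w in weights.items()) + bias
    return net > 0


def extract_m_of_n(weights, bias, tol=0.5, min_weight=1.0):
    """Step 3 (simplified): read a unit as 'M of these N antecedents'.

    Weights below min_weight are ignored; if the remaining weights are not
    within tol of their mean, no rule is returned. Both thresholds are
    illustrative assumptions.
    """
    kept = {name: w for name, w in weights.items() if w >= min_weight}
    if not kept:
        return None
    mean = sum(kept.values()) / len(kept)
    if any(abs(w - mean) > tol for w in kept.values()):
        return None
    # The unit fires when count * mean + bias > 0, i.e. count > -bias / mean.
    m = max(math.floor(-bias / mean) + 1, 0)
    return m, sorted(kept)


# Step 1: insert the hypothetical rule "goal :- a, b, not c".
initial_weights, initial_bias = insert_rule(["a", "b"], ["c"])

# Step 2 (refinement by standard training) is not shown; these weights are
# made-up stand-ins for what a trained unit might look like.
refined_weights = {"a": 3.9, "b": 4.1, "d": 4.0, "c": 0.2}
refined_bias = -6.0

# Step 3: extract a symbolic reading of the refined unit.
rule = extract_m_of_n(refined_weights, refined_bias)
```

With these made-up weights, `rule` reads as "2 of {a, b, d}", a form that stays compact even when the refined unit no longer matches the conjunction that was inserted.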
