Extraction, Insertion and Refinement of Symbolic Rules in Dynamically Driven Recurrent Neural Networks

Abstract Recurrent neural networks readily process, learn and generate temporal sequences, and they have been shown to possess impressive computational power. In particular, recurrent neural networks can be trained on symbolic string examples, encoded as temporal sequences, to behave like sequential finite state recognizers. We discuss methods for extracting, inserting and refining symbolic grammatical rules in recurrent networks: how rules are inserted, how they affect training and generalization, and how the rules can be checked and corrected. The capability of exchanging information between a symbolic representation (grammatical rules) and a connectionist representation (trained weights) has interesting implications. After partially known rules are inserted, recurrent networks can be trained to preserve the inserted rules that were correct and to correct, through training, the inserted rules that were 'incorrect', i.e. inconsistent with the training data.
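As a concrete illustration of rule insertion, the sketch below programs known DFA transitions into the second-order weights of a recurrent network before training, in the spirit of the approach the abstract describes: state neurons are identified with DFA states, inputs are one-hot symbol encodings, and a known transition is encoded by driving the target state neuron high and its competitors low. The network class, the weight-strength hyperparameter H, and all names here are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SecondOrderRNN:
    """Minimal second-order recurrent network (hypothetical sketch):
    S_i(t+1) = g( sum_{j,k} W[i,j,k] * S_j(t) * I_k(t) + b_i ).
    """
    def __init__(self, n_states, n_symbols, H=5.0, seed=0):
        rng = np.random.default_rng(seed)
        # Small random weights; rule insertion overwrites selected entries.
        self.W = rng.uniform(-0.1, 0.1, size=(n_states, n_states, n_symbols))
        self.b = np.zeros(n_states)
        self.H = H  # assumed "hint strength" for programmed rules

    def insert_rule(self, src, symbol, dst):
        """Encode the known DFA transition delta(src, symbol) = dst:
        inhibit every state neuron on this (state, input) pair, then
        excite only the transition target."""
        self.W[:, src, symbol] = -self.H
        self.W[dst, src, symbol] = +self.H

    def step(self, state, symbol_onehot):
        # einsum computes sum_{j,k} W[i,j,k] * S_j * I_k for each neuron i.
        return sigmoid(np.einsum('ijk,j,k->i', self.W, state, symbol_onehot) + self.b)

# Usage: insert the rule "state 0 --a--> state 1" for a 3-state, 2-symbol net.
net = SecondOrderRNN(n_states=3, n_symbols=2)
net.insert_rule(src=0, symbol=0, dst=1)

state = np.array([1.0, 0.0, 0.0])   # start in state 0 (one-hot)
a = np.array([1.0, 0.0])            # input symbol 'a' (one-hot)
print(net.step(state, a))           # state neuron 1 is driven towards 1.0
```

Because only selected weights are set to +/-H while the rest stay small and random, subsequent gradient training can preserve inserted rules that agree with the data and override those that do not, which is the refinement behavior the abstract highlights.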
