Representation of finite state automata in Recurrent Radial Basis Function networks

In this paper, we propose techniques for injecting finite state automata into Recurrent Radial Basis Function networks (R2BF). We show that, when given proper hints and when the weight space is suitably constrained, these networks behave as automata. We also suggest a technique for forcing the learning process to develop automata representations, based on adding a proper penalty function to the ordinary cost. Successful experimental results are shown for the inductive inference of regular grammars.
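The idea of adding a penalty function to the ordinary cost can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the form of the penalty, so the quartic term x^2 (1 - x)^2, the squared-error cost, and the names `state_penalty`, `total_cost`, and `penalty_weight` are hypothetical choices, picked because the term vanishes only when each state activation is exactly 0 or 1, which is one way of encouraging automaton-like (discrete) state codes.

```python
def state_penalty(states):
    """Penalty on a sequence of state vectors (assumed form, not from the paper).

    Each term x^2 * (1 - x)^2 is zero only at x = 0 or x = 1, so the sum
    is zero only when every state activation is binary.
    """
    return sum(x * x * (1.0 - x) ** 2 for step in states for x in step)


def total_cost(outputs, targets, states, penalty_weight=0.1):
    """Ordinary squared-error cost plus the weighted state penalty.

    `penalty_weight` is a hypothetical trade-off parameter.
    """
    error = sum((o - t) ** 2 for o, t in zip(outputs, targets))
    return error + penalty_weight * state_penalty(states)


# A trajectory with binary states incurs no penalty, while an
# intermediate activation such as 0.5 is penalized.
binary_states = [[0.0, 1.0], [1.0, 0.0]]
fuzzy_states = [[0.5, 1.0], [1.0, 0.0]]
print(state_penalty(binary_states))
print(state_penalty(fuzzy_states))
```

Minimizing such a combined cost by gradient descent would trade fitting the training strings against keeping the state activations near discrete values; the actual penalty and weighting used in the paper may differ.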
