Recurrent Neural Networks

This chapter introduces recurrent neural networks to readers familiar with artificial neural networks in general, and with multi-layer perceptrons trained by gradient descent (back-propagation) in particular. A recurrent neural network (RNN) is an artificial neural network with internal loops. These loops induce recursive dynamics in the network and thus introduce delayed activation dependencies across its processing elements (PEs).
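
As a rough illustration of how an internal loop creates delayed activation dependencies, the sketch below simulates a single processing element whose output is fed back to itself. It is only a minimal sketch under assumptions not taken from the chapter: the function name `run_rnn`, the tanh activation, and the weights `w_in` and `w_rec` are hypothetical choices made for this example.

```python
import math


def run_rnn(inputs, w_in=1.0, w_rec=0.5, h0=0.0):
    """Simulate one recurrent PE: h[t] = tanh(w_in * x[t] + w_rec * h[t-1]).

    All names and default values are illustrative assumptions.
    """
    h = h0
    states = []
    for x in inputs:
        # The recurrent term w_rec * h carries the previous activation forward.
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


# A single input pulse at the first time step, followed by silence.
impulse = [1.0, 0.0, 0.0, 0.0]

with_loop = run_rnn(impulse, w_rec=0.5)     # internal loop present
without_loop = run_rnn(impulse, w_rec=0.0)  # loop removed (feedforward PE)
```

With the loop in place, the activations after the first step stay nonzero although the input has returned to zero, so each activation depends on earlier ones. With `w_rec=0.0` the element has no memory and its activation drops to zero as soon as the input does.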
