Diagrammatic Methods for Deriving and Relating Temporal Neural Network Algorithms

In this paper, we present an approach for deriving gradient algorithms for time-dependent neural networks, based on a set of simple block diagram manipulation rules. We further introduce a complementary approach that uses flow graph interreciprocity to transform on-line learning algorithms into batch learning algorithms and vice versa. Together, the two approaches yield simple, intuitive relationships among algorithms such as real-time recurrent learning, dynamic backpropagation, and backpropagation-through-time. Examples for a variety of architectures illustrate both the generality and the simplicity of the diagrammatic approaches.
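The relationship between an on-line and a batch algorithm can be illustrated numerically. The following is a minimal sketch, not the diagrammatic derivation itself: it assumes a single scalar recurrent unit h_t = tanh(w*h_{t-1} + u*x_t) with a summed squared-error loss, and all function names (`forward`, `loss`, `grad_rtrl`, `grad_bptt`) are hypothetical. With the weights held fixed over the sequence, the forward sensitivity recursion used in real-time recurrent learning and the backward recursion used in backpropagation-through-time should return the same gradient with respect to w.

```python
import math

def forward(w, u, xs, h0=0.0):
    """Run the scalar recurrent unit; returns [h_0, h_1, ..., h_T]."""
    hs = [h0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs

def loss(w, u, xs, ds):
    """Summed squared error over the whole sequence."""
    hs = forward(w, u, xs)
    return sum(0.5 * (h - d) ** 2 for h, d in zip(hs[1:], ds))

def grad_rtrl(w, u, xs, ds):
    """Forward (on-line style) recursion: carry p_t = dh_t/dw along in time."""
    h, p, g = 0.0, 0.0, 0.0
    for x, d in zip(xs, ds):
        h_new = math.tanh(w * h + u * x)
        # Sensitivity update: direct dependence on w plus dependence via h_{t-1}.
        p = (1.0 - h_new ** 2) * (h + w * p)
        g += (h_new - d) * p
        h = h_new
    return g

def grad_bptt(w, u, xs, ds):
    """Backward (batch) recursion: propagate the error back through time."""
    hs = forward(w, u, xs)
    delta, g = 0.0, 0.0
    for t in range(len(xs), 0, -1):
        # delta is dE/da_t, where a_t is the pre-activation at step t.
        delta = (1.0 - hs[t] ** 2) * ((hs[t] - ds[t - 1]) + w * delta)
        g += delta * hs[t - 1]
    return g

if __name__ == "__main__":
    xs = [0.5, -1.0, 0.3, 0.8, -0.2]   # illustrative inputs
    ds = [0.1, -0.3, 0.2, 0.4, 0.0]    # illustrative targets
    w, u = 0.7, -0.4
    print("RTRL gradient:", grad_rtrl(w, u, xs, ds))
    print("BPTT gradient:", grad_bptt(w, u, xs, ds))
```

The forward version needs no stored history but carries a sensitivity term for every weight, while the backward version stores the state trajectory and makes one pass in reverse; in this sketch the two differ only in the order in which the same chain-rule products are accumulated.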
