Incremental Learning for RNNs: How Does It Affect Performance and Hidden Unit Activation?

In this short paper we summarise our work [5] on training first-order recurrent neural networks (RNNs) on the aⁿbⁿcⁿ language prediction task. We highlight the differences between incremental and non-incremental learning with respect to success rate, generalisation performance, and characteristics of hidden unit activation.
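The two training regimes compared in the paper can be sketched concretely. The listing below is a minimal illustration, assuming a small Elman-style first-order RNN trained by plain backpropagation through time on one string at a time, with an incremental schedule that starts with short strings and gradually admits deeper ones, versus a non-incremental schedule that draws from the full depth range from the start. The network size, learning rate, depth ranges, and all function names are illustrative assumptions, not details taken from the summarised work.

    # A minimal sketch of the a^n b^n c^n prediction task with incremental versus
    # non-incremental training of a small Elman-style RNN.  All hyperparameters,
    # the plain-BPTT update, and the growing-depth curriculum are illustrative
    # assumptions; they are not taken from the summarised paper.
    import numpy as np

    rng = np.random.default_rng(0)

    def one_hot(i, k=3):
        v = np.zeros(k)
        v[i] = 1.0
        return v

    def make_string(n):
        # Input/target pairs for one a^n b^n c^n string, as next-symbol prediction.
        s = [0] * n + [1] * n + [2] * n          # 0 = 'a', 1 = 'b', 2 = 'c'
        xs = [one_hot(c) for c in s[:-1]]
        ys = s[1:]
        return xs, ys

    class ElmanRNN:
        def __init__(self, hidden=5, lr=0.05):
            self.Wxh = rng.normal(0.0, 0.3, (hidden, 3))
            self.Whh = rng.normal(0.0, 0.3, (hidden, hidden))
            self.Why = rng.normal(0.0, 0.3, (3, hidden))
            self.lr = lr

        def step(self, x, h):
            h = np.tanh(self.Wxh @ x + self.Whh @ h)
            p = np.exp(self.Why @ h)
            return h, p / p.sum()

        def train_string(self, xs, ys):
            # Forward pass, then full backpropagation through time on one string.
            hs, ps = [np.zeros(self.Whh.shape[0])], []
            for x in xs:
                h, p = self.step(x, hs[-1])
                hs.append(h)
                ps.append(p)
            dWxh, dWhh, dWhy = map(np.zeros_like, (self.Wxh, self.Whh, self.Why))
            dh_next = np.zeros_like(hs[0])
            for t in reversed(range(len(xs))):
                dy = ps[t].copy()
                dy[ys[t]] -= 1.0                  # softmax cross-entropy gradient
                dWhy += np.outer(dy, hs[t + 1])
                dz = (1.0 - hs[t + 1] ** 2) * (self.Why.T @ dy + dh_next)
                dWxh += np.outer(dz, xs[t])
                dWhh += np.outer(dz, hs[t])
                dh_next = self.Whh.T @ dz
            for W, dW in ((self.Wxh, dWxh), (self.Whh, dWhh), (self.Why, dWhy)):
                W -= self.lr * np.clip(dW, -1.0, 1.0)   # clipped gradient step

        def count_errors(self, xs, ys):
            h, wrong = np.zeros(self.Whh.shape[0]), 0
            for x, y in zip(xs, ys):
                h, p = self.step(x, h)
                wrong += int(np.argmax(p) != y)
            return wrong

    def train(net, schedule, updates_per_stage=3000):
        # schedule: list of depth ranges; incremental = growing ranges,
        # non-incremental = the full depth range from the very first update.
        for depths in schedule:
            for _ in range(updates_per_stage):
                n = int(rng.choice(list(depths)))
                net.train_string(*make_string(n))

    incremental = [range(1, 3), range(1, 6), range(1, 11)]
    non_incremental = [range(1, 11)]

    net = ElmanRNN()
    train(net, incremental)
    print("errors on a^12 b^12 c^12:", net.count_errors(*make_string(12)))

Note that in the actual prediction task only part of each string is deterministically predictable (the number of a's cannot be anticipated), so the crude error count above is only a rough stand-in for the evaluation criteria used in the paper; the incremental schedule itself mirrors the "starting small" idea of Elman [22].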

[1] Stephan K. Chalup, et al. Incremental Learning in Biological and Machine Learning Systems, 2002, Int. J. Neural Syst.

[2] Janet Wiles, et al. On learning context-free and context-sensitive languages, 2002, IEEE Trans. Neural Networks.

[3] Mark Steijvers, et al. A Recurrent Network that performs a Context-Sensitive Prediction Task, 1996.

[4] Jeffrey L. Elman, et al. Finding Structure in Time, 1990, Cogn. Sci.

[5] Helko Lehmann, et al. Computation in Recurrent Neural Networks: From Counters to Iterated Function Systems, 1998, Australian Joint Conference on Artificial Intelligence.

[6] Jürgen Schmidhuber, et al. LSTM recurrent networks learn simple context-free and context-sensitive languages, 2001, IEEE Trans. Neural Networks.

[7] Paul Rodríguez, et al. Simple Recurrent Networks Learn Context-Free and Context-Sensitive Languages by Counting, 2001, Neural Computation.

[8] Janet Wiles, et al. Learning a context-free task with a recurrent neural network: An analysis of stability, 1999.

[9] Padraic Monaghan, et al. Proceedings of the 23rd Annual Conference of the Cognitive Science Society, 2001.

[10] Jordan B. Pollack, et al. RAAM for infinite context-free languages, 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000): Neural Computing, New Challenges and Perspectives for the New Millennium.

[11] Hava T. Siegelmann, et al. Neural networks and analog computation: beyond the Turing limit, 1999, Progress in Theoretical Computer Science.

[12] Helko Lehmann, et al. Designing a Counter: Another Case Study of Dynamics and Activation Landscapes in Recurrent Networks, 1997, KI.

[13] Stephan K. Chalup, et al. Incremental training of first order recurrent neural networks to predict a context-sensitive language, 2003, Neural Networks.

[14] Jordan B. Pollack, et al. Co-Evolution in the Successful Learning of Backgammon Strategy, 1998, Machine Learning.

[15] Janet Wiles, et al. Learning to count without a counter: A case study of dynamics and activation landscapes in recurrent networks, 1995.

[16] Jürgen Schmidhuber, et al. Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets, 2003, Neural Networks.

[17] John F. Kolen, et al. Field Guide to Dynamical Recurrent Networks, 2001.

[18] Janet Wiles, et al. Representation beyond finite states: Alternatives to pushdown automata, 2001.

[19] Janet Wiles, et al. Inductive Bias in Context-Free Language Learning, 1998.

[20] Janet Wiles, et al. Context-free and context-sensitive dynamics in recurrent neural networks, 2000, Connect. Sci.

[21] Jürgen Schmidhuber, et al. Learning Nonregular Languages: A Comparison of Simple Recurrent Networks and LSTM, 2002, Neural Computation.

[22] J. Elman. Learning and development in neural networks: the importance of starting small, 1993, Cognition.

[23] Paul Rodríguez, et al. A Recurrent Neural Network that Learns to Count, 1999, Connect. Sci.

[24] Kenji Doya, et al. Recurrent Networks: Learning Algorithms, 2002.

[25] Stephan K. Chalup, et al. Hill climbing in recurrent neural networks for learning the aⁿbⁿcⁿ language, 1999, ICONIP'99 (ANZIIS'99 & ANNES'99 & ACNN'99), 6th International Conference on Neural Information Processing, Proceedings (Cat. No.99EX378).