Neural networks with a self-refreshing memory: Knowledge transfer in sequential learning tasks without catastrophic forgetting

We explore a dual-network architecture with a self-refreshing memory (Ans and Rousset 1997) that overcomes catastrophic forgetting in sequential learning tasks. Its principle is that new knowledge is learned along with internally generated activity reflecting the network's history. What mainly distinguishes this model from other pseudorehearsal approaches in feedforward multilayer networks is the reverberating process used to generate pseudoitems. This process, which drives random activation toward the network's attractors, is better suited to capturing the deep structure of previously learned knowledge than a single feedforward pass of activity. The proposed mechanism for 'transporting memory' without loss of information between two different brain structures can be viewed as a neurobiologically plausible means of consolidation in long-term memory. Knowledge transfer is explored with regard to learning speed, ability to generalize, and vulnerability to network damage. We show that transfer is more efficient when two related tasks are learned sequentially than when they are learned concurrently. With a self-refreshing memory, network knowledge can be preserved for a long time and therefore reused in subsequent acquisitions.
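To make the mechanism concrete, the sketch below is a minimal single-network simplification in Python/NumPy; the paper's actual architecture couples two networks, whereas here one autoassociative network generates its own rehearsal stream. A tiny MLP first learns task A; pseudoitems are then produced by reverberation, meaning random activity is repeatedly re-injected through the network so that it settles toward a learned attractor, and the resulting input-output pairs are interleaved with task B training. All names and parameter values (AutoMLP, reverberate, n_cycles, the learning rate) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class AutoMLP:
    """Tiny autoassociative MLP trained with plain backprop (illustrative)."""
    def __init__(self, n, h, lr=0.5):
        self.W1 = rng.normal(0, 0.3, (h, n)); self.b1 = np.zeros(h)
        self.W2 = rng.normal(0, 0.3, (n, h)); self.b2 = np.zeros(n)
        self.lr = lr

    def forward(self, x):
        self.h = sigmoid(self.W1 @ x + self.b1)
        return sigmoid(self.W2 @ self.h + self.b2)

    def train_pair(self, x, t):
        # One gradient step on squared error between output and target t.
        y = self.forward(x)
        dy = (y - t) * y * (1 - y)
        dh = (self.W2.T @ dy) * self.h * (1 - self.h)
        self.W2 -= self.lr * np.outer(dy, self.h); self.b2 -= self.lr * dy
        self.W1 -= self.lr * np.outer(dh, x);      self.b1 -= self.lr * dh

def reverberate(net, n, n_cycles=5):
    """Re-inject the output as the next input so random activity
    drifts toward a learned attractor; return the resulting pseudoitem."""
    x = (rng.random(n) > 0.5).astype(float)        # random seed activity
    for _ in range(n_cycles):
        x = (net.forward(x) > 0.5).astype(float)   # binarize and re-inject
    return x, net.forward(x)                       # attractor-like input/output pair

# Sequential learning with self-generated rehearsal.
n = 8
net = AutoMLP(n, 12)
task_A = [(rng.random(n) > 0.5).astype(float) for _ in range(4)]
for _ in range(2000):                              # learn task A (autoassociation)
    p = task_A[rng.integers(len(task_A))]
    net.train_pair(p, p)

pseudo = [reverberate(net, n) for _ in range(20)]  # snapshot of old knowledge
task_B = [(rng.random(n) > 0.5).astype(float) for _ in range(4)]
for _ in range(2000):                              # learn task B, interleaving pseudoitems
    p = task_B[rng.integers(len(task_B))]
    net.train_pair(p, p)
    xs, ys = pseudo[rng.integers(len(pseudo))]
    net.train_pair(xs, ys)                         # refresh old memory alongside new learning
```

In the dual-network version described in the abstract, one network would learn the new task while the second, refreshed on the first network's pseudoitems, supplies the rehearsal stream, which is what allows memory to be 'transported' between the two structures without loss.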
