Everyone agrees that real cognition requires much more than static pattern recognition. In particular, it requires the ability to learn sequences of patterns (or actions). But learning sequences really means being able to learn multiple sequences, one after the other, without the most recently learned ones erasing the previously learned ones. And if catastrophic interference is a problem for the sequential learning of individual patterns, the problem is amplified many times over when multiple sequences of patterns have to be learned consecutively, because each new sequence consists of many linked patterns. In this paper we present a connectionist architecture that would seem to solve the problem of multiple sequence learning using pseudopatterns.

Introduction

Building a robot that could unfailingly recognize and respond to hundreds of objects in the world – apples, mice, telephones and paper napkins among them – would unquestionably constitute a major artificial intelligence tour de force. But everyone agrees that real cognition requires much more than static pattern recognition. In particular, it requires the ability to learn sequences of patterns (or actions). This was the primary reason for the development of the simple recurrent network (SRN; Elman, 1990) and the many variants of that architecture. But learning sequences means more than being able to learn a single, isolated sequence of patterns: it means being able to learn multiple sequences, one after the other, without the most recently learned ones erasing the previously learned ones. And if catastrophic interference – the phenomenon whereby new learning completely erases old learning – is already a problem for static pattern learning (McCloskey & Cohen, 1989; Ratcliff, 1990), it is amplified many times over when multiple sequences of patterns must be learned consecutively, because each sequence consists of many new linked patterns.
What hope is there for a previously learned sequence of patterns to survive after the network has learned a new sequence consisting of many individual patterns? In this paper, we present a connectionist architecture that solves the problem of multiple sequence learning.

Catastrophic interference

The problem of catastrophic interference (or forgetting) has been with the connectionist community for well over a decade now (McCloskey & Cohen, 1989; Ratcliff, 1990; for a review see Sharkey & Sharkey, 1995). Catastrophic forgetting occurs when newly learned information suddenly and completely erases information that was previously learned by the network, a phenomenon that is not only cognitively implausible but disastrous for most practical applications. The problem has been studied by numerous authors over the past decade (see French, 1999, for a review). The difficulty is that the very property that gives connectionist networks their remarkable abilities of generalization and graceful degradation in the presence of incomplete information – a single set of weights used to encode all learned information – is also the root cause of catastrophic interference (see, for example, French, 1992). Various authors (Ans & Rousset, 1997, 2000; French, 1997; Robins, 1995) have developed systems that rehearse on pseudo-episodes (or pseudopatterns) rather than on the real items that were previously learned. The basic principle of this mechanism is to interleave new external patterns, as they are learned, with internally generated pseudopatterns. These latter patterns, self-generated by the network from random activation, reflect (but are not identical to) the previously learned information. It has now been established that this pseudopattern rehearsal method effectively eliminates catastrophic forgetting. A serious problem remains, however: cognition involves more than being able to sequentially learn a series of "static" (non-temporal) patterns without interference.
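The pseudopattern rehearsal mechanism described above can be sketched as follows. This is a minimal illustration, not the architecture proposed in the paper: the network, its sizes, and the function names (`forward`, `make_pseudopatterns`) are all hypothetical, and the "trained" weights here are simply random for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# A minimal one-hidden-layer network; in a real run, W1 and W2 would
# already have been trained on a first set of patterns.
def forward(x, W1, W2):
    h = np.tanh(x @ W1)           # hidden-layer activations
    return np.tanh(h @ W2)        # output activations

def make_pseudopatterns(W1, W2, n, dim_in):
    """Generate pseudopatterns: feed random inputs through the trained
    network and pair each random input with the output it evokes."""
    inputs = rng.uniform(-1.0, 1.0, size=(n, dim_in))
    targets = forward(inputs, W1, W2)
    return inputs, targets

# Hypothetical sizes and stand-in "trained" weights for illustration.
dim_in, dim_hid, dim_out = 8, 16, 8
W1 = rng.normal(scale=0.5, size=(dim_in, dim_hid))
W2 = rng.normal(scale=0.5, size=(dim_hid, dim_out))

pseudo_x, pseudo_y = make_pseudopatterns(W1, W2, n=32, dim_in=dim_in)

# When a new pattern set arrives, interleave it with the pseudopatterns:
# gradient steps on the new items are then mixed with steps that
# re-assert the old input-output function, limiting interference.
new_x = rng.uniform(-1.0, 1.0, size=(32, dim_in))
new_y = rng.uniform(-1.0, 1.0, size=(32, dim_out))
train_x = np.concatenate([new_x, pseudo_x])
train_y = np.concatenate([new_y, pseudo_y])
```

The essential point is that the pseudopatterns approximate the network's current input-output function, so rehearsing on them preserves that function without storing any of the original training items.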
It is of equal importance to be able to serially learn many temporal sequences of patterns. We will propose a pseudopattern-based architecture that can effectively learn multiple temporal patterns consecutively. The key insight of this paper is this: once an SRN has learned a particular sequence, each pseudopattern generated by that network reflects the entire sequence (or set of sequences) that has been learned.
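That insight can be illustrated with a minimal sketch of pseudo-sequence generation from a trained SRN. Again, everything here is hypothetical: the Elman-style update, the weight names, and the output-fed-back-as-input scheme are one plausible reading of the mechanism, not the paper's exact architecture, and the weights are random stand-ins for a trained network.

```python
import numpy as np

rng = np.random.default_rng(1)

# A minimal Elman-style SRN step; the context units simply hold the
# previous hidden state h.  W_in, W_rec, W_out are assumed trained.
def srn_step(x, h, W_in, W_rec, W_out):
    h_new = np.tanh(x @ W_in + h @ W_rec)
    y = np.tanh(h_new @ W_out)
    return y, h_new

def make_pseudo_sequence(W_in, W_rec, W_out, length, dim):
    """Seed the trained SRN with a random input and zero context, then
    feed each output back in as the next input.  Because the recurrent
    dynamics were shaped by training, the emitted trajectory reflects
    the sequence(s) the SRN has learned, not just a single pattern."""
    x = rng.uniform(-1.0, 1.0, size=dim)
    h = np.zeros(W_rec.shape[0])
    seq = []
    for _ in range(length):
        y, h = srn_step(x, h, W_in, W_rec, W_out)
        seq.append((x.copy(), y.copy()))
        x = y                      # output becomes the next input
    return seq

dim, hid = 6, 12                   # hypothetical sizes
W_in = rng.normal(scale=0.5, size=(dim, hid))
W_rec = rng.normal(scale=0.5, size=(hid, hid))
W_out = rng.normal(scale=0.5, size=(hid, dim))

pseudo_seq = make_pseudo_sequence(W_in, W_rec, W_out, length=5, dim=dim)
```

Interleaving such pseudo-sequences with a newly presented sequence would then play the same protective role for temporal material that single pseudopatterns play for static patterns.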
[1] James L. McClelland, et al. Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychological Review, 1996.
[2] Bernard Ans, et al. Neural networks with a self-refreshing memory: Knowledge transfer in sequential learning tasks without catastrophic forgetting. Connection Science, 2000.
[3] R. French. Catastrophic forgetting in connectionist networks. Trends in Cognitive Sciences, 1999.
[4] James L. McClelland, et al. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 1995.
[5] Arnaud Destrebecqz, et al. Incremental sequence learning. 1996.
[6] Robert M. French, et al. Pseudo-recurrent Connectionist Networks: An Approach to the 'Sensitivity-Stability' Dilemma. Connection Science, 1997.
[7] R. Ratcliff, et al. Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychological Review, 1990.
[8] Anthony V. Robins, et al. Catastrophic Forgetting, Rehearsal and Pseudorehearsal. Connection Science, 1995.
[9] Noel E. Sharkey, et al. An Analysis of Catastrophic Interference. Connection Science, 1995.
[10] Jeffrey L. Elman, et al. Finding Structure in Time. Cognitive Science, 1990.
[11] Robert M. French, et al. Semi-distributed Representations and Catastrophic Forgetting in Connectionist Networks. 1992.
[12] Geoffrey E. Hinton. Connectionist Learning Procedures. Artificial Intelligence, 1989.
[13] Michael McCloskey, et al. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem. 1989.
[14] Andrew S. Noetzel, et al. Forced Simple Recurrent Neural Networks and Grammatical Inference. 1992.
[15] Bernard Ans, et al. Avoiding catastrophic forgetting by coupling two reverberating neural networks. 2004.