Integration of New Information in Memory: New Insights from a Complementary Learning Systems Perspective

According to complementary learning systems theory, integrating new memories into the neocortex without interfering with what is already known depends on a gradual learning process in which new items are interleaved with previously learned items. However, empirical studies show that information consistent with prior knowledge can be integrated very quickly. We use artificial neural networks with properties like those we attribute to the neocortex to develop a theoretical understanding of the role of consistency with prior knowledge in putatively neocortex-like learning systems. This provides new insights into when integration will be fast or slow, and into how integration might be made more efficient when the items to be learned are hierarchically structured. The work relies on deep linear networks that capture the qualitative aspects of the learning dynamics of the more complex non-linear networks used in previous work. The time course of learning in these networks can be linked to the hierarchical structure in the training data, captured mathematically as a set of dimensions that correspond to the branches in the hierarchy. In this context, a new item to be learned can be characterized as having some aspects that project onto previously known dimensions and others that require adding a new branch/dimension. The projection onto the known dimensions can be learned rapidly without interleaving, but learning the new dimension requires gradual interleaved learning. When a new item overlaps only with items within one branch of a hierarchy, interleaving can focus on the previously known items within this branch, resulting in faster integration with less interleaving overall. The discussion considers how the brain might exploit these facts to make learning more efficient and highlights predictions about which aspects of new information might be hard or easy to learn.
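The split of a new item into a part that projects onto known dimensions and a part that needs a new dimension can be illustrated with a small numerical sketch. The example below is not the authors' simulation: it assumes the known dimensions can be read off from a singular value decomposition of an item-by-feature matrix, and the items, features, and function names (`known_dimensions`, `decompose`) are hypothetical choices made for illustration. It shows only the decomposition, not the learning dynamics.

```python
import numpy as np

# Hypothetical toy data (an assumption, not taken from the paper):
# rows are features, columns are previously learned items.
items = ["canary", "salmon", "oak", "rose"]
features = ["grow", "move", "roots", "fly", "swim", "bark", "petals"]
Y = np.array([
    [1, 1, 1, 1],   # grow: shared by all items
    [1, 1, 0, 0],   # move: animals only
    [0, 0, 1, 1],   # roots: plants only
    [1, 0, 0, 0],   # fly: canary
    [0, 1, 0, 0],   # swim: salmon
    [0, 0, 1, 0],   # bark: oak
    [0, 0, 0, 1],   # petals: rose
], dtype=float)


def known_dimensions(Y, tol=1e-10):
    """Return an orthonormal basis for the feature-space dimensions
    spanned by the known items (left singular vectors of Y)."""
    U, s, _ = np.linalg.svd(Y, full_matrices=False)
    return U[:, s > tol]


def decompose(y_new, U):
    """Split a new item's feature vector into its projection onto the
    known dimensions and the residual that lies outside them."""
    projection = U @ (U.T @ y_new)
    residual = y_new - projection
    return projection, residual


U = known_dimensions(Y)

# A new item with the same features as the canary: fully consistent.
consistent_item = np.array([1, 1, 0, 1, 0, 0, 0], dtype=float)
# A new item that both flies and swims: partly outside the known dimensions.
novel_item = np.array([1, 1, 0, 1, 1, 0, 0], dtype=float)

p_consistent, r_consistent = decompose(consistent_item, U)
p_novel, r_novel = decompose(novel_item, U)

print("residual norm, consistent item:", np.linalg.norm(r_consistent))
print("residual norm, novel item:     ", np.linalg.norm(r_novel))
```

Under these assumptions, the first item leaves essentially no residual, which corresponds to the case the abstract describes as learnable rapidly without interleaving. The second item leaves a nonzero residual, which corresponds to the part that would require a new dimension and gradual interleaved learning.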
