Memory Transformation Enhances Reinforcement Learning in Dynamic Environments

Over the course of systems consolidation, reliance shifts from detailed episodic memories to generalized schematic memories. This shift is sometimes referred to as “memory transformation.” Here we demonstrate a previously unappreciated benefit of memory transformation, namely, its ability to enhance reinforcement learning in a dynamic environment. We developed a neural network that is trained to find rewards in a foraging task where reward locations are continuously changing. The network can use memories for specific locations (episodic memories) and statistical patterns of locations (schematic memories) to guide its search. We find that switching from an episodic to a schematic strategy over time enhances performance, because the reward location tends to be highly autocorrelated in the short term but regresses to a stable distribution in the long term. We also show that the statistics of the environment determine the optimal utilization of both types of memory. Our work recasts the theoretical question of why memory transformation occurs, shifting the focus from the avoidance of memory interference toward the enhancement of reinforcement learning across multiple timescales.

SIGNIFICANCE STATEMENT

As time passes, memories transform from a highly detailed state to a more gist-like state, in a process called “memory transformation.” Theories of memory transformation speak to its advantages in terms of reducing memory interference, increasing memory robustness, and building models of the environment. However, the role of memory transformation from the perspective of an agent that continuously acts and receives reward in its environment is not well explored. In this work, we present a view of memory transformation that defines it as a way of optimizing behavior across multiple timescales.
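The trade-off described in the abstract can be illustrated with a small numerical sketch. This is not the neural network from the paper. It assumes, purely for illustration, that the reward location follows a one-dimensional mean-reverting AR(1) process, so that it is autocorrelated at short delays and returns to a stable distribution at long delays. An "episodic" predictor guesses the last observed location and a "schematic" predictor guesses the long-run mean. All function names and parameter values (`a`, `mu`, `noise_sd`) are hypothetical.

```python
import math
import random


def simulate_reward_locations(n_steps, a=0.9, mu=0.0, noise_sd=1.0, seed=0):
    """Simulate a 1-D reward location as an AR(1) process (an assumed toy model).

    a        : per-step autocorrelation (0 < a < 1)
    mu       : long-run mean location
    noise_sd : standard deviation of the per-step drift
    """
    rng = random.Random(seed)
    # Start from the stationary distribution so no burn-in is needed.
    stationary_sd = noise_sd / math.sqrt(1.0 - a * a)
    x = rng.gauss(mu, stationary_sd)
    xs = []
    for _ in range(n_steps):
        xs.append(x)
        x = mu + a * (x - mu) + rng.gauss(0.0, noise_sd)
    return xs


def mean_squared_errors(xs, delay, mu=0.0):
    """Return (episodic_mse, schematic_mse) for predicting `delay` steps ahead.

    Episodic predictor : the location observed `delay` steps ago.
    Schematic predictor: the long-run mean `mu`.
    """
    n = len(xs) - delay
    episodic = sum((xs[t + delay] - xs[t]) ** 2 for t in range(n)) / n
    schematic = sum((xs[t + delay] - mu) ** 2 for t in range(n)) / n
    return episodic, schematic


def crossover_delay(a):
    """Delay k at which a**k == 0.5, where both predictors have equal expected error.

    For a stationary AR(1) process with variance v, the expected episodic error
    at lag k is 2 * v * (1 - a**k) and the expected schematic error is v.
    """
    return math.log(0.5) / math.log(a)


if __name__ == "__main__":
    xs = simulate_reward_locations(50_000)
    for delay in (1, 3, 20, 50):
        e, s = mean_squared_errors(xs, delay)
        better = "episodic" if e < s else "schematic"
        print(f"delay={delay:3d}  episodic={e:6.2f}  schematic={s:6.2f}  -> {better}")
    print(f"analytic crossover at about {crossover_delay(0.9):.1f} steps")
```

Under these assumptions the episodic guess has lower error at short delays and the schematic guess has lower error at long delays, with the crossover set by the autocorrelation `a`. This mirrors, in a much simpler setting, the claim that the statistics of the environment determine when each type of memory is most useful.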
