Dynamic preferences account for inter-animal variability during the continual learning of a cognitive task

Individual animals perform tasks in different ways, yet the nature and origin of that variability are poorly understood. In the context of spatial memory tasks, variability is often interpreted as resulting from differences in memory ability, but the validity of this interpretation is seldom tested because we lack a systematic approach for identifying and understanding the factors that make one animal’s behavior different from another’s. Here we identify such factors in the context of spatial alternation in rats, a task often described as relying solely on memory of past choices. We combine hypothesis-driven behavioral design and reinforcement learning modeling to identify spatial preferences that, when combined with memory, support learning of a spatial alternation task. Identifying these preferences allows us to capture differences among animals, including differences in overall learning ability. Our results show that understanding the complexity of behavior requires quantitative accounts of the preferences of each animal.
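
To make the idea of combining preferences with memory concrete, the following is a minimal sketch and not the model fitted in the paper. It assumes a softmax policy over arms whose logits are the sum of a memory-independent preference term and a term indexed by the previously visited arm, with both updated by a REINFORCE-style policy-gradient step. The names `choice_probs`, `update`, `pref`, `mem`, and the learning rate `lr` are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def choice_probs(pref, mem, prev_arm):
    """Probability of visiting each arm next.

    pref[a]          -- memory-independent preference for arm a (assumed form)
    mem[prev_arm][a] -- memory-dependent term, indexed by the last arm visited
    """
    logits = [pref[a] + mem[prev_arm][a] for a in range(len(pref))]
    return softmax(logits)


def update(pref, mem, prev_arm, chosen, reward, lr=0.1):
    """One REINFORCE-style step on both terms (updates in place).

    For a softmax policy, the gradient of log p(chosen) with respect to
    logit a is (1 if a == chosen else 0) - p[a].
    """
    probs = choice_probs(pref, mem, prev_arm)
    for a in range(len(pref)):
        grad = (1.0 if a == chosen else 0.0) - probs[a]
        pref[a] += lr * reward * grad
        mem[prev_arm][a] += lr * reward * grad


# Usage: three arms, no initial bias; the animal was at arm 0, chose arm 1,
# and was rewarded.
n_arms = 3
pref = [0.0] * n_arms
mem = [[0.0] * n_arms for _ in range(n_arms)]
update(pref, mem, prev_arm=0, chosen=1, reward=1.0)
probs_after = choice_probs(pref, mem, prev_arm=0)
```

In this sketch the preference term generalizes across contexts (a rewarded visit to arm 1 raises its probability regardless of the previous arm), while the memory term changes only the row for the arm just left. Differences between animals would then show up as differences in the fitted `pref` and `mem` values.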
