Cortical and Hippocampal Correlates of Deliberation during Model-Based Decisions for Rewards in Humans

How do we use our memories of the past to guide decisions we've never had to make before? Although extensive work describes how the brain learns to repeat rewarded actions, decisions can also be influenced by associations between stimuli or events that do not directly involve reward (for example, when planning routes using a cognitive map, or chess moves using predicted countermoves), and such associations are critical when deciding among novel options. This process is known as model-based decision making. While the learning of environmental relations that might support model-based decisions is well studied, and separately this sort of information has been inferred to impact decisions, there is little evidence concerning the full cycle by which such associations are acquired and then drive choices. Of particular interest is whether decisions are directly supported by the same mnemonic systems characterized for relational learning more generally, or instead rely on other, specialized representations. Here, building on our previous work, which isolated dual representations underlying sequential predictive learning, we directly demonstrate that one such representation, encoded by the hippocampal memory system and adjacent cortical structures, supports goal-directed decisions. Using interleaved learning and decision tasks, we monitor predictive learning directly and also trace its influence on decisions for reward. We quantitatively compare the learning processes underlying multiple behavioral and fMRI observables using computational model fits. Across both tasks, a quantitatively consistent learning process explains reaction times, choices, and both expectation- and surprise-related neural activity. The same hippocampal and ventral stream regions engaged in anticipating stimuli during learning are also engaged in proportion to the difficulty of decisions. These results support a role for predictive associations, learned by the hippocampal memory system, being recalled during choice formation.
