Surprise! Neural correlates of Pearce–Hall and Rescorla–Wagner coexist within the brain

Learning theory and computational accounts suggest that learning depends on errors in outcome prediction as well as on changes in the processing of, or attention to, events. These divergent ideas are captured, on the one hand, by models such as Rescorla–Wagner (RW) and temporal difference (TD) learning, which emphasize errors as directly driving changes in associative strength, and, on the other hand, by models such as Pearce–Hall (PH) and its more recent variants, which propose that errors promote changes in associative strength by modulating attention to and processing of events. Numerous studies have shown that phasic firing of midbrain dopamine (DA) neurons carries a signed error signal consistent with RW or TD learning theories, and recently we have shown that this signal can be dissociated from attentional correlates in the basolateral amygdala and anterior cingulate. Here we review these data along with new evidence: (i) implicating habenula and striatal regions in supporting error signaling in midbrain DA neurons; and (ii) suggesting that the central nucleus of the amygdala and prefrontal regions process the amygdalar attentional signal. Although the neural instantiations of the RW and PH signals are dissociable and complementary, they may nonetheless be linked. Any such linkage would have implications for understanding why one signal dominates learning in some situations and not others, and for appreciating the potential impact on learning of neuropathological conditions involving altered DA or amygdalar function, such as schizophrenia, addiction or anxiety disorders.
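The contrast between the two kinds of signal can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the article: it tracks a single cue, computes an RW-style signed prediction error on each trial, and separately computes a PH-style associability as a running average of the unsigned error. The function name `simulate`, the parameters `lr`, `gamma` and `alpha0`, and the choice to drive the value update with the signed error alone are all assumptions made for brevity.

```python
def simulate(outcomes, lr=0.2, gamma=0.5, alpha0=0.5):
    """Return per-trial signed errors and unsigned associabilities.

    outcomes: outcome magnitude on each trial (lambda), e.g. 1.0 = reward.
    lr:       RW-style learning rate (hypothetical value).
    gamma:    weight on the most recent unsigned error in the PH-style
              associability update (hypothetical value).
    alpha0:   starting associability (hypothetical value).
    """
    value = 0.0      # current outcome prediction for the cue
    alpha = alpha0   # current associability (attention) of the cue
    errors, alphas = [], []
    for lam in outcomes:
        # RW-style signed error: positive when the outcome is better than
        # predicted, negative when it is worse.
        delta = lam - value
        # PH-style associability: depends only on the size of the error,
        # so it rises after surprise in either direction.
        alpha = gamma * abs(delta) + (1.0 - gamma) * alpha
        # Assumption: the prediction is updated by the signed error alone.
        value += lr * delta
        errors.append(delta)
        alphas.append(alpha)
    return errors, alphas


if __name__ == "__main__":
    # 20 rewarded trials followed by 5 unexpected omissions.
    trials = [1.0] * 20 + [0.0] * 5
    errors, alphas = simulate(trials)
    for t in (0, 19, 20):
        print(t, round(errors[t], 3), round(alphas[t], 3))
```

Under these assumptions, the first omission trial produces a negative signed error while the associability increases, which is the qualitative dissociation between signed and unsigned signals that the abstract describes.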
