Supplementary Eye Field Encodes Reward Prediction Error

The outcomes of many decisions are uncertain and therefore need to be evaluated. We studied this evaluation process by recording neuronal activity in the supplementary eye field (SEF) during an oculomotor gambling task. While the monkeys awaited the outcome, SEF neurons represented attributes of the chosen option, namely, its expected value and the uncertainty of this value signal. After the gamble result was revealed, a number of neurons reflected the actual reward outcome. Other neurons evaluated the outcome by encoding the difference between the reward expectation represented during the delay period and the actual reward amount (i.e., the reward prediction error). Thus, SEF encodes not only reward prediction error but also all the components necessary for its computation: the expected and the actual outcome. This suggests that SEF might actively evaluate value-based decisions in the oculomotor domain, independent of other brain regions.
