Whole-Brain Neural Dynamics of Probabilistic Reward Prediction

Predicting future reward is paramount to performing an optimal action. Although a number of brain areas are known to encode such predictions, a detailed account of how the associated representations evolve over time is lacking. Here, we address this question using human magnetoencephalography (MEG) and multivariate analyses of instantaneous activity in reconstructed sources. We overtrained participants on a simple instrumental reward learning task in which geometric cues predicted a distribution of possible rewards, from which a sample was revealed 2000 ms later. We show that predicted mean reward (i.e., expected value) and predicted reward variability (i.e., economic risk) are encoded distinctly. Representations of mean reward are seen early in parietal and visual areas and later in frontal regions, with orbitofrontal cortex emerging last. Strikingly, an encoding of reward variability emerges simultaneously in parietal/sensory and frontal sources, and later than mean reward encoding. An orbitofrontal variability encoding emerges around the same time as that seen for mean reward. Crucially, cross-prediction showed that mean reward and variability representations are distinct, and also revealed that instantaneous representations become more stable over time. Across sources, the best-fitting metric for variability signals was the coefficient of variation (rather than SD or variance), but distinct best metrics were seen for individual brain regions. Our data demonstrate how a dynamic encoding of probabilistic reward prediction unfolds in the brain in both time and space.

SIGNIFICANCE STATEMENT Predicting future reward is paramount to optimal behavior. To gain insight into the underlying neural computations, we investigate how reward representations in the brain arise over time. Using magnetoencephalography, we show that a representation of predicted mean reward emerges early in parietal/sensory regions and later in frontal cortex. In contrast, predicted reward variability representations appear in most regions at the same time, and slightly later than for mean reward. For both features, representations change dynamically for >1000 ms before stabilizing. The best metric for encoding variability is the coefficient of variation, with heterogeneity in this encoding seen between brain areas. The results provide novel insights into the emergence of predictive reward representations.
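The quantities compared above (mean reward, and variability expressed as variance, SD, or coefficient of variation) can be illustrated with a minimal sketch. The function name `risk_metrics` and the two cue distributions are hypothetical and chosen only for illustration; they are not the reward distributions used in the study.

```python
from math import sqrt


def risk_metrics(outcomes, probs):
    """Summarize a discrete reward distribution.

    Returns the expected value (mean) and three candidate variability
    metrics: variance, standard deviation (SD), and coefficient of
    variation (CV = SD / mean).
    """
    if len(outcomes) != len(probs):
        raise ValueError("outcomes and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")

    mean = sum(p * x for x, p in zip(outcomes, probs))
    variance = sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probs))
    sd = sqrt(variance)
    # CV is undefined for a zero mean; report NaN in that case.
    cv = sd / mean if mean != 0 else float("nan")
    return {"mean": mean, "variance": variance, "sd": sd, "cv": cv}


# Hypothetical cues (illustrative values only, not from the study):
# both have the same spread, but different mean reward.
cue_a = risk_metrics([10, 30], [0.5, 0.5])
cue_b = risk_metrics([30, 50], [0.5, 0.5])

for name, m in (("cue_a", cue_a), ("cue_b", cue_b)):
    print(name, m)
```

In this toy case the two cues have identical variance and SD but different coefficients of variation, because CV scales variability by the mean. This is the kind of dissociation that makes it possible, in principle, to ask which metric best fits a variability signal.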
