Discrete Coding of Reward Probability and Uncertainty by Dopamine Neurons

Uncertainty is critical in the measurement of information and in assessing the accuracy of predictions. It is determined by probability P, being maximal at P = 0.5 and decreasing at higher and lower probabilities. Using distinct stimuli to indicate the probability of reward, we found that the phasic activation of dopamine neurons varied monotonically across the full range of probabilities, supporting past claims that this response codes the discrepancy between predicted and actual reward. In contrast, a previously unobserved response covaried with uncertainty and consisted of a gradual increase in activity until the potential time of reward. The coding of uncertainty suggests a possible role for dopamine signals in attention-based learning and risk-taking behavior.
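The claim that uncertainty is maximal at P = 0.5 can be made concrete with two standard measures of uncertainty for a binary (reward / no-reward) outcome: the variance of a Bernoulli variable, P(1 - P), and its Shannon entropy. The following sketch is purely illustrative and is not part of the original study:

```python
import math

def bernoulli_variance(p):
    """Variance of a Bernoulli reward delivered with probability p: p * (1 - p)."""
    return p * (1.0 - p)

def bernoulli_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli outcome; zero at p = 0 or p = 1."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

# Both measures peak at p = 0.5 and fall off symmetrically toward 0 and 1,
# mirroring the uncertainty profile described in the abstract.
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"p={p:.2f}  variance={bernoulli_variance(p):.4f}  "
          f"entropy={bernoulli_entropy(p):.4f}")
```

Note that both measures are symmetric about P = 0.5 (e.g. P = 0.25 and P = 0.75 yield identical values), which is why stimuli predicting reward with complementary probabilities carry the same outcome uncertainty despite differing expected values.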
