Reinforcement learning: The Good, The Bad and The Ugly

[1]  Tomoyuki Furuyashiki,et al.  Rat Orbitofrontal Cortex Separately Encodes Response and Outcome Information during Performance of Goal-Directed Behavior , 2008, The Journal of Neuroscience.

[2]  M. Botvinick Hierarchical models of behavior and prefrontal function , 2008, Trends in Cognitive Sciences.

[3]  J. Paul Bolam,et al.  Faculty Opinions recommendation of Unique properties of mesoprefrontal neurons within a dual mesocorticolimbic dopamine system. , 2008 .

[4]  Saori C. Tanaka,et al.  Low-Serotonin Levels Increase Delayed Reward Discounting in Humans , 2008, The Journal of Neuroscience.

[5]  B. Moghaddam,et al.  Differential tonic influence of lateral habenula on prefrontal cortex and nucleus accumbens dopamine release , 2008, The European journal of neuroscience.

[6]  K. Doya Modulators of decision making , 2008, Nature Neuroscience.

[7]  Timothy E. J. Behrens,et al.  Choice, uncertainty and value in prefrontal and cingulate cortex , 2008, Nature Neuroscience.

[8]  S. Lammel,et al.  Unique Properties of Mesoprefrontal Neurons within a Dual Mesocorticolimbic Dopamine System , 2008, Neuron.

[9]  Samuel M. McClure,et al.  BOLD Responses Reflecting Dopaminergic Signals in the Human Ventral Tegmental Area , 2008, Science.

[10]  Pearl H. Chiu,et al.  Self Responses along Cingulate Cortex Reveal Quantitative Neural Phenotype for High-Functioning Autism , 2008, Neuron.

[11]  B. Everitt,et al.  Cocaine Seeking Habits Depend upon Dopamine-Dependent Serial Connectivity Linking the Ventral with the Dorsal Striatum , 2008, Neuron.

[12]  P. Dayan,et al.  Human Pavlovian–Instrumental Transfer , 2008, The Journal of Neuroscience.

[13]  P. Glimcher,et al.  Action and Outcome Encoding in the Primate Caudate Nucleus , 2007, The Journal of Neuroscience.

[14]  Saori C. Tanaka,et al.  Serotonin Differentially Regulates Short- and Long-Term Prediction of Rewards in the Ventral and Dorsal Striatum , 2007, PloS one.

[15]  M. Reuter,et al.  Genetically Determined Differences in Learning from Errors , 2007, Science.

[16]  M. Roesch,et al.  Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards , 2007, Nature Neuroscience.

[17]  C. Padoa-Schioppa Orbitofrontal Cortex and the Computation of Economic Value , 2007, Annals of the New York Academy of Sciences.

[18]  Matthijs A. A. van der Meer,et al.  Integrating hippocampus and striatum in decision-making , 2007, Current Opinion in Neurobiology.

[19]  B. Richmond,et al.  A Comparison of Reward‐Contingent Neuronal Activity in Monkey Orbitofrontal Cortex and Ventral Striatum , 2007, Annals of the New York Academy of Sciences.

[20]  Joseph J Paton,et al.  Flexible Neural Representations of Value in the Primate Brain , 2007, Annals of the New York Academy of Sciences.

[21]  P. Glimcher,et al.  The neural correlates of subjective value during intertemporal choice , 2007, Nature Neuroscience.

[22]  Peter Dayan,et al.  Serotonin, Inhibition, and Negative Mood , 2007, PLoS Comput. Biol..

[23]  Michael J. Frank,et al.  Hold Your Horses: Impulsivity, Deep Brain Stimulation, and Medication in Parkinsonism , 2007, Science.

[24]  N. Daw,et al.  Reinforcement Learning Signals in the Human Striatum Distinguish Learners from Nonlearners during Reward-Based Decision Making , 2007, The Journal of Neuroscience.

[25]  Michael Moutoussis,et al.  Persecutory delusions and the conditioned avoidance paradigm: Towards an integration of the psychology and biology of paranoia , 2007, Cognitive neuropsychiatry.

[26]  Konrad Paul Kording,et al.  Decision Theory: What "Should" the Nervous System Do? , 2007, Science.

[27]  Raymond J. Dolan,et al.  Anticipation of novelty recruits reward system and hippocampus while promoting recollection , 2007, NeuroImage.

[28]  Michael J. Frank,et al.  Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning , 2007, Proceedings of the National Academy of Sciences.

[29]  S. Kapur,et al.  Temporal Difference Modeling of the Blood-Oxygen Level Dependent Response During Aversive Conditioning in Humans: Effects of Dopaminergic Modulation , 2007, Biological Psychiatry.

[30]  P. Montague Neuroeconomics: a view from neuroscience. , 2007, Functional neurology.

[31]  Colin Camerer,et al.  Social neuroeconomics: the neural circuitry of social preferences , 2007, Trends in Cognitive Sciences.

[32]  Joseph J. Paton,et al.  Expectation Modulates Neural Responses to Pleasant and Aversive Stimuli in Primate Amygdala , 2007, Neuron.

[33]  Timothy E. J. Behrens,et al.  Learning the value of information in an uncertain world , 2007, Nature Neuroscience.

[34]  P. Glimcher,et al.  Statistics of midbrain dopamine neuron spike trains in the awake primate. , 2007, Journal of neurophysiology.

[35]  D. Hassabis,et al.  When Fear Is Near: Threat Imminence Elicits Prefrontal-Periaqueductal Gray Shifts in Humans , 2007, Science.

[36]  R. Wightman,et al.  Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens , 2007, Nature Neuroscience.

[37]  B. Balleine,et al.  The Role of the Dorsal Striatum in Reward and Decision-Making , 2007, The Journal of Neuroscience.

[38]  J. Krakauer,et al.  Why Don't We Move Faster? Parkinson's Disease, Movement Vigor, and Implicit Motivation , 2007, The Journal of Neuroscience.

[39]  Jadin C. Jackson,et al.  Reconciling reinforcement learning models with behavioral extinction and renewal: implications for addiction, relapse, and problem gambling. , 2007, Psychological review.

[40]  O. Hikosaka,et al.  Lateral habenula as a source of negative reward signals in dopamine neurons , 2007, Nature.

[41]  J. Gold,et al.  The neural basis of decision making. , 2007, Annual review of neuroscience.

[42]  A. Lüthi,et al.  Processing of Temporal Unpredictability in Human and Animal Amygdala , 2007, The Journal of Neuroscience.

[43]  R. Dolan,et al.  The human amygdala and orbital prefrontal cortex in behavioural regulation , 2007, Philosophical Transactions of the Royal Society B: Biological Sciences.

[44]  Kevin McCabe,et al.  Neural signature of fictive learning signals in a sequential investment task , 2007, Proceedings of the National Academy of Sciences.

[45]  Angela J. Yu,et al.  Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration , 2007, Philosophical Transactions of the Royal Society B: Biological Sciences.

[46]  Samuel M. McClure,et al.  Time Discounting for Primary Rewards , 2007, The Journal of Neuroscience.

[47]  R. Dolan,et al.  How the Brain Translates Money into Force: A Neuroimaging Study of Subliminal Motivation , 2007, Science.

[48]  P. Dayan,et al.  Differential Encoding of Losses and Gains in the Human Striatum , 2007, The Journal of Neuroscience.

[49]  K. Preuschoff,et al.  Adding Prediction Risk to the Theory of Reward Learning , 2007, Annals of the New York Academy of Sciences.

[50]  O. Hikosaka Basal Ganglia Mechanisms of Reward‐Oriented Eye Movement , 2007, Annals of the New York Academy of Sciences.

[51]  Keiji Tanaka,et al.  Medial prefrontal cell activity signaling prediction errors of action values , 2007, Nature Neuroscience.

[52]  Yoav Shoham,et al.  If multi-agent learning is the answer, what is the question? , 2007, Artif. Intell..

[53]  K. Doya,et al.  Multiple Representations of Belief States and Action Values in Corticobasal Ganglia Loops , 2007, Annals of the New York Academy of Sciences.

[54]  J. O'Doherty,et al.  Model‐Based fMRI and Its Application to Reward Learning and Decision Making , 2007, Annals of the New York Academy of Sciences.

[55]  Vivian V. Valentin,et al.  Determining the Neural Substrates of Goal-Directed Learning in the Human Brain , 2007, The Journal of Neuroscience.

[56]  W. Schultz,et al.  Learning-Related Human Brain Activations Reflecting Individual Finances , 2007, Neuron.

[57]  Timothy Edward John Behrens,et al.  Triangulating a Cognitive Control Network Using Diffusion-Weighted Magnetic Resonance Imaging (MRI) and Functional MRI , 2007, The Journal of Neuroscience.

[58]  S. Kapur,et al.  Separate brain regions code for salience vs. valence during reward prediction in humans , 2007, Human brain mapping.

[59]  P. Dayan,et al.  Tonic dopamine: opportunity costs and the control of response vigor , 2007, Psychopharmacology.

[60]  P. Holland,et al.  Dissociable effects of disconnecting amygdala central nucleus from the ventral tegmental area or substantia nigra on learned orienting and incentive motivation , 2007, The European journal of neuroscience.

[61]  Keiji Tanaka,et al.  Effects of novelty on activity of lateral and medial prefrontal neurons , 2007, Neuroscience Research.

[62]  Thomas E. Hazy,et al.  PVLV: the primary value and learned value Pavlovian learning algorithm. , 2007, Behavioral neuroscience.

[63]  J. O'Doherty,et al.  Decoding the neural substrates of reward-related decision making with functional MRI , 2007, Proceedings of the National Academy of Sciences.

[64]  P. Redgrave,et al.  A direct projection from superior colliculus to substantia nigra pars compacta in the cat , 2006, Neuroscience.

[65]  P. Redgrave,et al.  Nociceptive responses of midbrain dopaminergic neurones are modulated by the superior colliculus in the rat , 2006, Neuroscience.

[66]  Samuel M. McClure,et al.  Policy Adjustment in a Dynamic Economic Game , 2006, PloS one.

[67]  J. O'Doherty,et al.  Reward Value Coding Distinct From Risk Attitude-Related Uncertainty Coding in Human Reward Systems , 2006, Journal of neurophysiology.

[68]  Á. Pascual-Leone,et al.  Diminishing Reciprocal Fairness by Disrupting the Right Prefrontal Cortex , 2006, Science.

[69]  R. Montague Why Choose This Book?: How We Make Decisions , 2006 .

[70]  W. Hauber,et al.  Dopamine D1 receptors in the anterior cingulate cortex regulate effort-based decision making. , 2006, Learning & memory.

[71]  Kenji Doya,et al.  Humans Can Adopt Optimal Discounting Strategy under Real-Time Constraints , 2006, PLoS Comput. Biol..

[72]  Jonathan D. Cohen,et al.  The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. , 2006, Psychological review.

[73]  Michael J. Frank,et al.  Hold your horses: A dynamic computational role for the subthalamic nucleus in decision making , 2006, Neural Networks.

[75]  Kenji Doya,et al.  Brain mechanism of reward prediction under predictable and unpredictable environmental dynamics , 2006, Neural Networks.

[76]  R. Dolan,et al.  Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans , 2006, Nature.

[77]  M. Roesch,et al.  Encoding of Time-Discounted Rewards in Orbitofrontal Cortex Is Independent of Value Representation , 2006, Neuron.

[78]  P. Gean,et al.  The Role of the Amygdala in the Extinction of Conditioned Fear , 2006, Biological Psychiatry.

[79]  D. Kumaran,et al.  Frames, Biases, and Rational Decision-Making in the Human Brain , 2006, Science.

[80]  N. Bunzeck,et al.  Absolute Coding of Stimulus Novelty in the Human Substantia Nigra/VTA , 2006, Neuron.

[81]  S. Quartz,et al.  Neural Differentiation of Expected Reward and Risk in Human Subcortical Structures , 2006, Neuron.

[82]  E. Vaadia,et al.  Midbrain dopamine neurons encode decisions for future action , 2006, Nature Neuroscience.

[83]  P. Dayan,et al.  A normative perspective on motivation , 2006, Trends in Cognitive Sciences.

[84]  J. O'Doherty,et al.  Is Avoiding an Aversive Outcome Rewarding? Neural Substrates of Avoidance Learning in the Human Brain , 2006, PLoS biology.

[85]  P. Dayan,et al.  Cortical substrates for exploratory decisions in humans , 2006, Nature.

[86]  S. Ishii,et al.  Resolution of Uncertainty in Prefrontal Cortex , 2006, Neuron.

[87]  P. Holland,et al.  Role of Substantia Nigra–Amygdala Connections in Surprise-Induced Enhancement of Attention , 2006, The Journal of Neuroscience.

[88]  Kae Nakamura,et al.  Role of Dopamine in the Primate Caudate Nucleus in Reward Modulation of Saccades , 2006, The Journal of Neuroscience.

[89]  P. Holland,et al.  Different Roles for Amygdala Central Nucleus and Substantia Innominata in the Surprise-Induced Enhancement of Learning , 2006, The Journal of Neuroscience.

[90]  K. Doya,et al.  The computational neurobiology of learning and reward , 2006, Current Opinion in Neurobiology.

[91]  Daeyeol Lee Neural basis of quasi-rational decision making , 2006, Current Opinion in Neurobiology.

[92]  W. Hauber,et al.  Inactivation of the ventral tegmental area abolished the general excitatory influence of Pavlovian cues on instrumental performance. , 2006, Learning & memory.

[93]  Joseph J. Paton,et al.  The primate amygdala represents the positive and negative value of visual stimuli during learning , 2006, Nature.

[94]  Michael J. Frank,et al.  Making Working Memory Work: A Computational Model of Learning in the Prefrontal Cortex and Basal Ganglia , 2006, Neural Computation.

[95]  Kae Nakamura,et al.  Basal ganglia orient eyes to reward. , 2006, Journal of neurophysiology.

[96]  J. O'Doherty,et al.  Predictive Neural Coding of Reward Preference Involves Dissociable Responses in Human Ventral Midbrain and Ventral Striatum , 2006, Neuron.

[97]  B. Balleine Neural bases of food-seeking: Affect, arousal and reward in corticostriatolimbic circuits , 2005, Physiology & Behavior.

[98]  P. Dayan,et al.  Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control , 2005, Nature Neuroscience.

[99]  J. O'Doherty,et al.  Human Neural Learning Depends on Reward Prediction Errors in the Blocking Paradigm , 2005, Journal of neurophysiology.

[100]  Camelia M. Kuhnen,et al.  The Neural Basis of Financial Risk Taking , 2005, Neuron.

[101]  J. O'Doherty,et al.  Regret and its avoidance: a neuroimaging study of choice behavior , 2005, Nature Neuroscience.

[102]  P. J. Gmytrasiewicz,et al.  A Framework for Sequential Planning in Multi-Agent Settings , 2005, AI&M.

[103]  W. Pan,et al.  Dopamine Cells Respond to Predicted Events during Classical Conditioning: Evidence for Eligibility Traces in the Reward-Learning Network , 2005, The Journal of Neuroscience.

[104]  Wolfgang Hauber,et al.  Involvement of the rat anterior cingulate cortex in control of instrumental responses guided by reward expectancy. , 2005, Learning & memory.

[105]  W. Schultz,et al.  Adaptive Coding of Reward Value by Dopamine Neurons , 2005, Science.

[106]  M. Walton,et al.  The mesocortical dopamine projection to anterior cingulate cortex plays no role in guiding effort-related decisions. , 2005, Behavioral neuroscience.

[107]  Michael J. Frank,et al.  By Carrot or by Stick: Cognitive Reinforcement Learning in Parkinsonism , 2004, Science.

[108]  Samuel M. McClure,et al.  Separate Neural Systems Value Immediate and Delayed Monetary Rewards , 2004, Science.

[109]  Munetaka Shidara,et al.  Differential encoding of information about progress through multi-trial reward schedules by three groups of ventral striatal neurons , 2004, Neuroscience Research.

[110]  Peter Dayan,et al.  Temporal difference models describe higher-order learning in humans , 2004, Nature.

[111]  P. Corr,et al.  A two-dimensional neuropsychology of defense: fear/anxiety and defensive distance , 2004, Neuroscience & Biobehavioral Reviews.

[112]  Keiji Tanaka,et al.  The role of the medial prefrontal cortex in achieving goals , 2004, Current Opinion in Neurobiology.

[113]  J. Bolam,et al.  Uniform Inhibition of Dopamine Neurons in the Ventral Tegmental Area by Aversive Stimuli , 2004, Science.

[114]  P. Montague,et al.  Dynamic Gain Control of Dopamine Delivery in Freely Moving Animals , 2004, The Journal of Neuroscience.

[115]  K. Doya,et al.  A Neural Correlate of Reward-Based Behavioral Learning in Caudate Nucleus: A Functional Magnetic Resonance Imaging Study of a Stochastic Decision Task , 2004, The Journal of Neuroscience.

[116]  Nikos K Logothetis,et al.  Interpreting the BOLD signal. , 2004, Annual review of physiology.

[117]  B. Kolb,et al.  Do rats have a prefrontal cortex? , 2003, Behavioural Brain Research.

[118]  W. Schultz,et al.  Coding of Predicted Reward Omission by Dopamine Neurons in a Conditioned Inhibition Paradigm , 2003, The Journal of Neuroscience.

[119]  A. Bechara Decisions, Uncertainty, and the Brain: The Science of Neuroeconomics , 2003 .

[120]  Matthew F S Rushworth,et al.  Functional Specialization within Medial Frontal Cortex of the Anterior Cingulate for Evaluating Effort-Related Decisions , 2003, The Journal of Neuroscience.

[121]  Karl J. Friston,et al.  Temporal Difference Models and Reward-Related Learning in the Human Brain , 2003, Neuron.

[122]  S. Killcross,et al.  Coordination of actions and habits in the medial prefrontal cortex of rats. , 2003, Cerebral cortex.

[123]  Colin Camerer Behavioral Game Theory: Experiments in Strategic Interaction , 2003 .

[124]  J. Salamone,et al.  Motivational views of reinforcement: implications for understanding the behavioral functions of nucleus accumbens dopamine , 2002, Behavioural Brain Research.

[125]  M. Bouton Context, ambiguity, and unlearning: sources of relapse after behavioral extinction , 2002, Biological Psychiatry.

[126]  R. Byrne Mental models and counterfactual thoughts about what might have been , 2002, Trends in Cognitive Sciences.

[127]  B. Hyland,et al.  Firing modes of midbrain dopamine cells in the freely moving rat , 2002, Neuroscience.

[128]  K. Berridge,et al.  Positive and Negative Motivation in Nucleus Accumbens Shell: Bivalent Rostrocaudal Gradients for GABA-Elicited Eating, Taste “Liking”/“Disliking” Reactions, Place Preference/Avoidance, and Fear , 2002, The Journal of Neuroscience.

[129]  H. Pashler Stevens' Handbook of Experimental Psychology , 2002 .

[130]  Sham M. Kakade,et al.  Opponent interactions between serotonin and dopamine , 2002, Neural Networks.

[131]  Peter Dayan,et al.  Dopamine: generalization and bonuses , 2002, Neural Networks.

[132]  M. El-Sabaawi Breakdown of Will , 2002 .

[133]  Isaac Meilijson,et al.  Evolution of Reinforcement Learning in Uncertain Environments: A Simple Explanation for Complex Foraging Behaviors , 2002, Adapt. Behav..

[134]  Peter L. Bartlett,et al.  Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..

[135]  K. Berridge,et al.  Fear and Feeding in the Nucleus Accumbens Shell: Rostrocaudal Segregation of GABA-Elicited Defensive Behavior Versus Eating Behavior , 2001, The Journal of Neuroscience.

[136]  D. Kahneman,et al.  Functional Imaging of Neural Responses to Expectancy and Experience of Monetary Gains and Losses , 2001, Neuron.

[137]  Samuel M. McClure,et al.  Predictability Modulates Human Brain Response to Reward , 2001, The Journal of Neuroscience.

[138]  L. Nystrom,et al.  Tracking the hemodynamic responses to reward and punishment in the striatum. , 2000, Journal of neurophysiology.

[139]  Nikolaus R. McFarland,et al.  Striatonigrostriatal Pathways in Primates Form an Ascending Spiral from the Shell to the Dorsolateral Striatum , 2000, The Journal of Neuroscience.

[140]  D. Joel,et al.  The connections of the dopaminergic system with the striatum in rats and primates: an analysis with respect to the functional and compartmental organization of the striatum , 2000, Neuroscience.

[141]  P. Holland,et al.  Amygdala circuitry in attentional and representational processes , 1999, Trends in Cognitive Sciences.

[142]  Ralph Neuneier,et al.  Risk-Sensitive Reinforcement Learning , 1998, Machine Learning.

[143]  C. Gallistel,et al.  Toward a neurobiology of temporal cognition: advances and challenges , 1997, Current Opinion in Neurobiology.

[144]  Peter Dayan,et al.  A Neural Substrate of Prediction and Reward , 1997, Science.

[145]  F. Graeff,et al.  Role of 5-HT in stress, anxiety, and depression , 1996, Pharmacology Biochemistry and Behavior.

[146]  P. Dayan,et al.  A framework for mesencephalic dopamine systems based on predictive Hebbian learning , 1996, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[147]  G. Loewenstein,et al.  Anomalies in Intertemporal Choice: Evidence and an Interpretation , 1992 .

[148]  Richard S. Sutton,et al.  Learning to predict by the methods of temporal differences , 1988, Machine Learning.

[149]  P. Soubrié Reconciling the role of central serotonin neurons in human and animal behavior , 1986, Behavioral and Brain Sciences.

[150]  R. Sugden,et al.  Regret Theory: An alternative theory of rational choice under uncertainty , 1982 .

[151]  David E. Bell,et al.  Regret in Decision Making under Uncertainty , 1982, Oper. Res..

[152]  K. Breland,et al.  The misbehavior of organisms. , 1961 .

[153]  P. Montague,et al.  Theoretical and Empirical Studies of Learning , 2009 .

[154]  Colin Camerer,et al.  Neuroeconomics: decision making and the brain , 2008 .

[155]  R. O’Reilly,et al.  Separate neural substrates for skill learning and performance in the ventral and dorsal striatum , 2007, Nature Neuroscience.

[156]  R. Poldrack,et al.  Cortical and Subcortical Contributions to Stop Signal Response Inhibition: Role of the Subthalamic Nucleus , 2006, The Journal of Neuroscience.

[157]  K. Berman,et al.  Neural Coding of Distinct Statistical Properties of Reward Information in Humans , 2005, Cerebral Cortex.

[158]  Sridhar Mahadevan,et al.  Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..

[160]  E. Rolls,et al.  Abstract reward and punishment representations in the human orbitofrontal cortex , 2001, Nature Neuroscience.

[161]  A. Barto,et al.  Adaptive Critics and the Basal Ganglia , 1994 .

[162]  Joel L. Davis,et al.  Adaptive Critics and the Basal Ganglia , 1995 .

[163]  Mahesan Niranjan,et al.  On-line Q-learning using connectionist systems , 1994 .

[164]  D. Blanchard,et al.  Ethoexperimental approaches to the biology of emotion. , 1988, Annual review of psychology.