Paper information - Reinforcement learning: The Good, The Bad and The Ugly - 字舞流文

Reinforcement learning: The Good, The Bad and The Ugly

P. Dayan | Y. Niv

[1] Tomoyuki Furuyashiki,et al. Rat Orbitofrontal Cortex Separately Encodes Response and Outcome Information during Performance of Goal-Directed Behavior , 2008, The Journal of Neuroscience.

[2] M. Botvinick. Hierarchical models of behavior and prefrontal function , 2008, Trends in Cognitive Sciences.

[3] J. Paul Bolam,et al. Faculty Opinions recommendation of Unique properties of mesoprefrontal neurons within a dual mesocorticolimbic dopamine system. , 2008 .

[4] Saori C. Tanaka,et al. Low-Serotonin Levels Increase Delayed Reward Discounting in Humans , 2008, The Journal of Neuroscience.

[5] B. Moghaddam,et al. Differential tonic influence of lateral habenula on prefrontal cortex and nucleus accumbens dopamine release , 2008, The European journal of neuroscience.

[6] K. Doya. Modulators of decision making , 2008, Nature Neuroscience.

[7] Timothy E. J. Behrens,et al. Choice, uncertainty and value in prefrontal and cingulate cortex , 2008, Nature Neuroscience.

[8] S. Lammel,et al. Unique Properties of Mesoprefrontal Neurons within a Dual Mesocorticolimbic Dopamine System , 2008, Neuron.

[9] Samuel M. McClure,et al. BOLD Responses Reflecting Dopaminergic Signals in the Human Ventral Tegmental Area , 2008, Science.

[10] Pearl H. Chiu,et al. Self Responses along Cingulate Cortex Reveal Quantitative Neural Phenotype for High-Functioning Autism , 2008, Neuron.

[11] B. Everitt,et al. Cocaine Seeking Habits Depend upon Dopamine-Dependent Serial Connectivity Linking the Ventral with the Dorsal Striatum , 2008, Neuron.

[12] P. Dayan,et al. Human Pavlovian–Instrumental Transfer , 2008, The Journal of Neuroscience.

[13] P. Glimcher,et al. Action and Outcome Encoding in the Primate Caudate Nucleus , 2007, The Journal of Neuroscience.

[14] Saori C. Tanaka,et al. Serotonin Differentially Regulates Short- and Long-Term Prediction of Rewards in the Ventral and Dorsal Striatum , 2007, PloS one.

[15] M. Reuter,et al. Genetically Determined Differences in Learning from Errors , 2007, Science.

[16] M. Roesch,et al. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards , 2007, Nature Neuroscience.

[17] C. Padoa-Schioppa. Orbitofrontal Cortex and the Computation of Economic Value , 2007, Annals of the New York Academy of Sciences.

[18] Matthijs A. A. van der Meer,et al. Integrating hippocampus and striatum in decision-making , 2007, Current Opinion in Neurobiology.

[19] B. Richmond,et al. A Comparison of Reward‐Contingent Neuronal Activity in Monkey Orbitofrontal Cortex and Ventral Striatum , 2007, Annals of the New York Academy of Sciences.

[20] Joseph J Paton,et al. Flexible Neural Representations of Value in the Primate Brain , 2007, Annals of the New York Academy of Sciences.

[21] P. Glimcher,et al. The neural correlates of subjective value during intertemporal choice , 2007, Nature Neuroscience.

[22] Peter Dayan,et al. Serotonin, Inhibition, and Negative Mood , 2007, PLoS Comput. Biol..

[23] Michael J. Frank,et al. Hold Your Horses: Impulsivity, Deep Brain Stimulation, and Medication in Parkinsonism , 2007, Science.

[24] N. Daw,et al. Reinforcement Learning Signals in the Human Striatum Distinguish Learners from Nonlearners during Reward-Based Decision Making , 2007, The Journal of Neuroscience.

[25] Michael Moutoussis,et al. Persecutory delusions and the conditioned avoidance paradigm: Towards an integration of the psychology and biology of paranoia , 2007, Cognitive neuropsychiatry.

[26] Konrad Paul Kording,et al. Decision Theory: What "Should" the Nervous System Do? , 2007, Science.

[27] Raymond J. Dolan,et al. Anticipation of novelty recruits reward system and hippocampus while promoting recollection , 2007, NeuroImage.

[28] Michael J. Frank,et al. Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning , 2007, Proceedings of the National Academy of Sciences.

[29] S. Kapur,et al. Temporal Difference Modeling of the Blood-Oxygen Level Dependent Response During Aversive Conditioning in Humans: Effects of Dopaminergic Modulation , 2007, Biological Psychiatry.

[30] P. Montague. Neuroeconomics: a view from neuroscience. , 2007, Functional neurology.

[31] Colin Camerer,et al. Social neuroeconomics: the neural circuitry of social preferences , 2007, Trends in Cognitive Sciences.

[32] Joseph J. Paton,et al. Expectation Modulates Neural Responses to Pleasant and Aversive Stimuli in Primate Amygdala , 2007, Neuron.

[33] Timothy E. J. Behrens,et al. Learning the value of information in an uncertain world , 2007, Nature Neuroscience.

[34] P. Glimcher,et al. Statistics of midbrain dopamine neuron spike trains in the awake primate. , 2007, Journal of neurophysiology.

[35] D. Hassabis,et al. When Fear Is Near: Threat Imminence Elicits Prefrontal-Periaqueductal Gray Shifts in Humans , 2007, Science.

[36] R. Wightman,et al. Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens , 2007, Nature Neuroscience.

[37] B. Balleine,et al. The Role of the Dorsal Striatum in Reward and Decision-Making , 2007, The Journal of Neuroscience.

[38] J. Krakauer,et al. Why Don't We Move Faster? Parkinson's Disease, Movement Vigor, and Implicit Motivation , 2007, The Journal of Neuroscience.

[39] Jadin C. Jackson,et al. Reconciling reinforcement learning models with behavioral extinction and renewal: implications for addiction, relapse, and problem gambling. , 2007, Psychological review.

[40] O. Hikosaka,et al. Lateral habenula as a source of negative reward signals in dopamine neurons , 2007, Nature.

[41] J. Gold,et al. The neural basis of decision making. , 2007, Annual review of neuroscience.

[42] A. Lüthi,et al. Processing of Temporal Unpredictability in Human and Animal Amygdala , 2007, The Journal of Neuroscience.

[43] R. Dolan,et al. The human amygdala and orbital prefrontal cortex in behavioural regulation , 2007, Philosophical Transactions of the Royal Society B: Biological Sciences.

[44] Kevin McCabe,et al. Neural signature of fictive learning signals in a sequential investment task , 2007, Proceedings of the National Academy of Sciences.

[45] Angela J. Yu,et al. Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration , 2007, Philosophical Transactions of the Royal Society B: Biological Sciences.

[46] Samuel M. McClure,et al. Time Discounting for Primary Rewards , 2007, The Journal of Neuroscience.

[47] R. Dolan,et al. How the Brain Translates Money into Force: A Neuroimaging Study of Subliminal Motivation , 2007, Science.

[48] P. Dayan,et al. Differential Encoding of Losses and Gains in the Human Striatum , 2007, The Journal of Neuroscience.

[49] K. Preuschoff,et al. Adding Prediction Risk to the Theory of Reward Learning , 2007, Annals of the New York Academy of Sciences.

[50] O. Hikosaka. Basal Ganglia Mechanisms of Reward‐Oriented Eye Movement , 2007, Annals of the New York Academy of Sciences.

[51] Keiji Tanaka,et al. Medial prefrontal cell activity signaling prediction errors of action values , 2007, Nature Neuroscience.

[52] Yoav Shoham,et al. If multi-agent learning is the answer, what is the question? , 2007, Artif. Intell..

[53] K. Doya,et al. Multiple Representations of Belief States and Action Values in Corticobasal Ganglia Loops , 2007, Annals of the New York Academy of Sciences.

[54] J. O'Doherty,et al. Model‐Based fMRI and Its Application to Reward Learning and Decision Making , 2007, Annals of the New York Academy of Sciences.

[55] Vivian V. Valentin,et al. Determining the Neural Substrates of Goal-Directed Learning in the Human Brain , 2007, The Journal of Neuroscience.

[56] W. Schultz,et al. Learning-Related Human Brain Activations Reflecting Individual Finances , 2007, Neuron.

[57] Timothy Edward John Behrens,et al. Triangulating a Cognitive Control Network Using Diffusion-Weighted Magnetic Resonance Imaging (MRI) and Functional MRI , 2007, The Journal of Neuroscience.

[58] S. Kapur,et al. Separate brain regions code for salience vs. valence during reward prediction in humans , 2007, Human brain mapping.

[59] P. Dayan,et al. Tonic dopamine: opportunity costs and the control of response vigor , 2007, Psychopharmacology.

[60] P. Holland,et al. Dissociable effects of disconnecting amygdala central nucleus from the ventral tegmental area or substantia nigra on learned orienting and incentive motivation , 2007, The European journal of neuroscience.

[61] Keiji Tanaka,et al. Effects of novelty on activity of lateral and medial prefrontal neurons , 2007, Neuroscience Research.

[62] Thomas E. Hazy,et al. PVLV: the primary value and learned value Pavlovian learning algorithm. , 2007, Behavioral neuroscience.

[63] J. O'Doherty,et al. Decoding the neural substrates of reward-related decision making with functional MRI , 2007, Proceedings of the National Academy of Sciences.

[64] P. Redgrave,et al. A direct projection from superior colliculus to substantia nigra pars compacta in the cat , 2006, Neuroscience.

[65] P. Redgrave,et al. Nociceptive responses of midbrain dopaminergic neurones are modulated by the superior colliculus in the rat , 2006, Neuroscience.

[66] Samuel M. McClure,et al. Policy Adjustment in a Dynamic Economic Game , 2006, PloS one.

[67] J. O'Doherty,et al. Reward Value Coding Distinct From Risk Attitude-Related Uncertainty Coding in Human Reward Systems , 2006, Journal of neurophysiology.

[68] Á. Pascual-Leone,et al. Diminishing Reciprocal Fairness by Disrupting the Right Prefrontal Cortex , 2006, Science.

[69] R. Montague. Why Choose This Book?: How We Make Decisions , 2006 .

[70] W. Hauber,et al. Dopamine D1 receptors in the anterior cingulate cortex regulate effort-based decision making. , 2006, Learning & memory.

[71] Kenji Doya,et al. Humans Can Adopt Optimal Discounting Strategy under Real-Time Constraints , 2006, PLoS Comput. Biol..

[72] Jonathan D. Cohen,et al. The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. , 2006, Psychological review.

[73] Michael J. Frank,et al. Hold your horses: A dynamic computational role for the subthalamic nucleus in decision making , 2006, Neural Networks.

[74] Peter Dayan,et al. Non-commercial Research and Educational Use including without Limitation Use in Instruction at Your Institution, Sending It to Specific Colleagues That You Know, and Providing a Copy to Your Institution's Administrator. All Other Uses, Reproduction and Distribution, including without Limitation Comm , 2022 .

[75] Kenji Doya,et al. Brain mechanism of reward prediction under predictable and unpredictable environmental dynamics , 2006, Neural Networks.

[76] R. Dolan,et al. Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans , 2006, Nature.

[77] M. Roesch,et al. Encoding of Time-Discounted Rewards in Orbitofrontal Cortex Is Independent of Value Representation , 2006, Neuron.

[78] P. Gean,et al. The Role of the Amygdala in the Extinction of Conditioned Fear , 2006, Biological Psychiatry.

[79] D. Kumaran,et al. Frames, Biases, and Rational Decision-Making in the Human Brain , 2006, Science.

[80] N. Bunzeck,et al. Absolute Coding of Stimulus Novelty in the Human Substantia Nigra/VTA , 2006, Neuron.

[81] S. Quartz,et al. Neural Differentiation of Expected Reward and Risk in Human Subcortical Structures , 2006, Neuron.

[82] E. Vaadia,et al. Midbrain dopamine neurons encode decisions for future action , 2006, Nature Neuroscience.

[83] P. Dayan,et al. A normative perspective on motivation , 2006, Trends in Cognitive Sciences.

[84] J. O'Doherty,et al. Is Avoiding an Aversive Outcome Rewarding? Neural Substrates of Avoidance Learning in the Human Brain , 2006, PLoS biology.

[85] P. Dayan,et al. Cortical substrates for exploratory decisions in humans , 2006, Nature.

[86] S. Ishii,et al. Resolution of Uncertainty in Prefrontal Cortex , 2006, Neuron.

[87] P. Holland,et al. Role of Substantia Nigra–Amygdala Connections in Surprise-Induced Enhancement of Attention , 2006, The Journal of Neuroscience.

[88] Kae Nakamura,et al. Role of Dopamine in the Primate Caudate Nucleus in Reward Modulation of Saccades , 2006, The Journal of Neuroscience.

[89] P. Holland,et al. Different Roles for Amygdala Central Nucleus and Substantia Innominata in the Surprise-Induced Enhancement of Learning , 2006, The Journal of Neuroscience.

[90] K. Doya,et al. The computational neurobiology of learning and reward , 2006, Current Opinion in Neurobiology.

[91] Daeyeol Lee. Neural basis of quasi-rational decision making , 2006, Current Opinion in Neurobiology.

[92] W. Hauber,et al. Inactivation of the ventral tegmental area abolished the general excitatory influence of Pavlovian cues on instrumental performance. , 2006, Learning & memory.

[93] Joseph J. Paton,et al. The primate amygdala represents the positive and negative value of visual stimuli during learning , 2006, Nature.

[94] Michael J. Frank,et al. Making Working Memory Work: A Computational Model of Learning in the Prefrontal Cortex and Basal Ganglia , 2006, Neural Computation.

[95] Kae Nakamura,et al. Basal ganglia orient eyes to reward. , 2006, Journal of neurophysiology.

[96] J. O'Doherty,et al. Predictive Neural Coding of Reward Preference Involves Dissociable Responses in Human Ventral Midbrain and Ventral Striatum , 2006, Neuron.

[97] B. Balleine. Neural bases of food-seeking: Affect, arousal and reward in corticostriatolimbic circuits , 2005, Physiology & Behavior.

[98] P. Dayan,et al. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control , 2005, Nature Neuroscience.

[99] J. O'Doherty,et al. Human Neural Learning Depends on Reward Prediction Errors in the Blocking Paradigm , 2005, Journal of neurophysiology.

[100] Camelia M. Kuhnen,et al. The Neural Basis of Financial Risk Taking , 2005, Neuron.

[101] J. O'Doherty,et al. Regret and its avoidance: a neuroimaging study of choice behavior , 2005, Nature Neuroscience.

[102] P. J. Gmytrasiewicz,et al. A Framework for Sequential Planning in Multi-Agent Settings , 2005, AI&M.

[103] W. Pan,et al. Dopamine Cells Respond to Predicted Events during Classical Conditioning: Evidence for Eligibility Traces in the Reward-Learning Network , 2005, The Journal of Neuroscience.

[104] Wolfgang Hauber,et al. Involvement of the rat anterior cingulate cortex in control of instrumental responses guided by reward expectancy. , 2005, Learning & memory.

[105] W. Schultz,et al. Adaptive Coding of Reward Value by Dopamine Neurons , 2005, Science.

[106] M. Walton,et al. The mesocortical dopamine projection to anterior cingulate cortex plays no role in guiding effort-related decisions. , 2005, Behavioral neuroscience.

[107] Michael J. Frank,et al. By Carrot or by Stick: Cognitive Reinforcement Learning in Parkinsonism , 2004, Science.

[108] Samuel M. McClure,et al. Separate Neural Systems Value Immediate and Delayed Monetary Rewards , 2004, Science.

[109] Munetaka Shidara,et al. Differential encoding of information about progress through multi-trial reward schedules by three groups of ventral striatal neurons , 2004, Neuroscience Research.

[110] Peter Dayan,et al. Temporal difference models describe higher-order learning in humans , 2004, Nature.

[111] P. Corr,et al. A two-dimensional neuropsychology of defense: fear/anxiety and defensive distance , 2004, Neuroscience & Biobehavioral Reviews.

[112] Keiji Tanaka,et al. The role of the medial prefrontal cortex in achieving goals , 2004, Current Opinion in Neurobiology.

[113] J. Bolam,et al. Uniform Inhibition of Dopamine Neurons in the Ventral Tegmental Area by Aversive Stimuli , 2004, Science.

[114] P. Montague,et al. Dynamic Gain Control of Dopamine Delivery in Freely Moving Animals , 2004, The Journal of Neuroscience.

[115] K. Doya,et al. A Neural Correlate of Reward-Based Behavioral Learning in Caudate Nucleus: A Functional Magnetic Resonance Imaging Study of a Stochastic Decision Task , 2004, The Journal of Neuroscience.

[116] Nikos K Logothetis,et al. Interpreting the BOLD signal. , 2004, Annual review of physiology.

[117] B. Kolb,et al. Do rats have a prefrontal cortex? , 2003, Behavioural Brain Research.

[118] W. Schultz,et al. Coding of Predicted Reward Omission by Dopamine Neurons in a Conditioned Inhibition Paradigm , 2003, The Journal of Neuroscience.

[119] A. Bechara. Decisions, Uncertainty, and the Brain: The Science of Neuroeconomics , 2003 .

[120] Matthew F S Rushworth,et al. Functional Specialization within Medial Frontal Cortex of the Anterior Cingulate for Evaluating Effort-Related Decisions , 2003, The Journal of Neuroscience.

[121] Karl J. Friston,et al. Temporal Difference Models and Reward-Related Learning in the Human Brain , 2003, Neuron.

[122] S. Killcross,et al. Coordination of actions and habits in the medial prefrontal cortex of rats. , 2003, Cerebral cortex.

[123] Colin Camerer. Behavioral Game Theory: Experiments in Strategic Interaction , 2003 .

[124] J. Salamone,et al. Motivational views of reinforcement: implications for understanding the behavioral functions of nucleus accumbens dopamine , 2002, Behavioural Brain Research.

[125] M. Bouton. Context, ambiguity, and unlearning: sources of relapse after behavioral extinction , 2002, Biological Psychiatry.

[126] R. Byrne. Mental models and counterfactual thoughts about what might have been , 2002, Trends in Cognitive Sciences.

[127] B. Hyland,et al. Firing modes of midbrain dopamine cells in the freely moving rat , 2002, Neuroscience.

[128] K. Berridge,et al. Positive and Negative Motivation in Nucleus Accumbens Shell: Bivalent Rostrocaudal Gradients for GABA-Elicited Eating, Taste “Liking”/“Disliking” Reactions, Place Preference/Avoidance, and Fear , 2002, The Journal of Neuroscience.

[129] H. Pashler. Stevens' Handbook of Experimental Psychology , 2002 .

[130] Sham M. Kakade,et al. Opponent interactions between serotonin and dopamine , 2002, Neural Networks.

[131] Peter Dayan,et al. Dopamine: generalization and bonuses , 2002, Neural Networks.

[132] M. El-Sabaawi. Breakdown of Will , 2002 .

[133] Isaac Meilijson,et al. Evolution of Reinforcement Learning in Uncertain Environments: A Simple Explanation for Complex Foraging Behaviors , 2002, Adapt. Behav..

[134] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..

[135] K. Berridge,et al. Fear and Feeding in the Nucleus Accumbens Shell: Rostrocaudal Segregation of GABA-Elicited Defensive Behavior Versus Eating Behavior , 2001, The Journal of Neuroscience.

[136] D. Kahneman,et al. Functional Imaging of Neural Responses to Expectancy and Experience of Monetary Gains and Losses , 2001 .

[137] Samuel M. McClure,et al. Predictability Modulates Human Brain Response to Reward , 2001, The Journal of Neuroscience.

[138] L. Nystrom,et al. Tracking the hemodynamic responses to reward and punishment in the striatum. , 2000, Journal of neurophysiology.

[139] Nikolaus R. McFarland,et al. Striatonigrostriatal Pathways in Primates Form an Ascending Spiral from the Shell to the Dorsolateral Striatum , 2000, The Journal of Neuroscience.

[140] D. Joel,et al. The connections of the dopaminergic system with the striatum in rats and primates: an analysis with respect to the functional and compartmental organization of the striatum , 2000, Neuroscience.

[141] P. Holland,et al. Amygdala circuitry in attentional and representational processes , 1999, Trends in Cognitive Sciences.

[142] Ralph Neuneier,et al. Risk-Sensitive Reinforcement Learning , 1998, Machine Learning.

[143] C. Gallistel,et al. Toward a neurobiology of temporal cognition: advances and challenges , 1997, Current Opinion in Neurobiology.

[144] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.

[145] F. Graeff,et al. Role of 5-HT in stress, anxiety, and depression , 1996, Pharmacology Biochemistry and Behavior.

[146] P. Dayan,et al. A framework for mesencephalic dopamine systems based on predictive Hebbian learning , 1996, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[147] G. Loewenstein,et al. Anomalies in Intertemporal Choice: Evidence and an Interpretation , 1992 .

[148] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.

[149] P. Soubrié. Reconciling the role of central serotonin neurons in human and animal behavior , 1986, Behavioral and Brain Sciences.

[150] R. Sugden,et al. Regret Theory: An Alternative Theory of Rational Choice Under Uncertainty , 1982, The Economic Journal.

[151] David E. Bell,et al. Regret in Decision Making under Uncertainty , 1982, Oper. Res..

[152] K. Breland,et al. The misbehavior of organisms. , 1961 .

[153] P. Montague,et al. Theoretical and Empirical Studies of Learning , 2009 .

[154] Colin Camerer,et al. Neuroeconomics: decision making and the brain , 2008 .

[155] R. O’Reilly,et al. Separate neural substrates for skill learning and performance in the ventral and dorsal striatum , 2007, Nature Neuroscience.

[156] R. Poldrack,et al. Cortical and Subcortical Contributions to Stop Signal Response Inhibition: Role of the Subthalamic Nucleus , 2006, The Journal of Neuroscience.

[157] K. Berman,et al. Neural Coding of Distinct Statistical Properties of Reward Information in Humans , 2005, Cerebral Cortex (doi:10.1093/cercor/bhj004).

[158] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..

[160] E. Rolls,et al. Abstract reward and punishment representations in the human orbitofrontal cortex , 2001, Nature Neuroscience.

[161] A. Barto,et al. Adaptive Critics and the Basal Ganglia , 1994 .

[162] Joel L. Davis,et al. Adaptive Critics and the Basal Ganglia , 1995 .

[163] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .

[164] D. Blanchard,et al. Ethoexperimental approaches to the biology of emotion. , 1988, Annual review of psychology.