Paper Information - Psychological and Neuroscientific Connections with Reinforcement Learning

Psychological and Neuroscientific Connections with Reinforcement Learning

The field of Reinforcement Learning (RL) was inspired in large part by research in animal behavior and psychology. Early research showed that animals can, through trial and error, learn to execute behavior that eventually leads to some (presumably satisfactory) outcome, and decades of subsequent research have been (and still are) aimed at discovering the mechanisms of this learning process. This chapter describes behavioral and theoretical research in animal learning that is directly related to fundamental concepts used in RL. It then describes neuroscientific research suggesting that animals and many RL algorithms use very similar learning mechanisms. Along the way, I highlight ways that research in computer science contributes to and can be inspired by research in psychology and neuroscience.

Ashvin Shah

[1] John W. Moore. A Neuroscientist's Guide to Classical Conditioning , 2002 .

[2] Karl J. Friston,et al. Dissociable Roles of Ventral and Dorsal Striatum in Instrumental Conditioning , 2004, Science.

[3] Jung Hoon Sul,et al. Role of Striatum in Updating Values of Chosen Actions , 2009, The Journal of Neuroscience.

[4] Florentin Wörgötter,et al. Temporal Sequence Learning, Prediction, and Control: A Review of Different Models and Their Relation to Biological Mechanisms , 2005, Neural Computation.

[5] Joseph E LeDoux,et al. Contributions of the Amygdala to Emotion Processing: From Animal Models to Human Behavior , 2005, Neuron.

[6] Jonathan D. Cohen,et al. Computational roles for dopamine in behavioural control , 2004, Nature.

[7] Tatsuo K Sato,et al. Correlated Coding of Motivation and Outcome of Decision by Dopamine Neurons , 2003, The Journal of Neuroscience.

[8] W. Schultz,et al. Responses of monkey dopamine neurons during learning of behavioral reactions. , 1992, Journal of neurophysiology.

[9] A. Graybiel,et al. Activity of striatal neurons reflects dynamic encoding and recoding of procedural memories , 2005, Nature.

[10] J C Houk,et al. Action selection and refinement in subcortical loops through basal ganglia and cerebellum , 2007, Philosophical Transactions of the Royal Society B: Biological Sciences.

[11] R. Hienz,et al. Shaping the location of a pigeon's peck: effect of rate and size of shaping steps. , 1980, Journal of the experimental analysis of behavior.

[12] Kevin N. Gurney,et al. Reverse Engineering the Vertebrate Brain: Methodological Principles for a Biologically Grounded Programme of Cognitive Modelling , 2009, Cognitive Computation.

[13] P. Strick,et al. Skill representation in the primary motor cortex after long-term practice. , 2007, Journal of neurophysiology.

[14] M. Roesch,et al. Ventral Striatal Neurons Encode the Value of the Chosen Action in Rats Deciding between Differently Delayed or Sized Rewards , 2009, The Journal of Neuroscience.

[15] D. Shanks,et al. A Re-examination of Probability Matching and Rational Choice , 2002 .

[16] A. Graybiel. Habits, rituals, and the evaluative brain. , 2008, Annual review of neuroscience.

[17] Michael J. Frank,et al. Dynamic Dopamine Modulation in the Basal Ganglia: A Neurocomputational Account of Cognitive Deficits in Medicated and Nonmedicated Parkinsonism , 2005, Journal of Cognitive Neuroscience.

[18] B. Balleine,et al. Human and Rodent Homologies in Action Control: Corticostriatal Determinants of Goal-Directed and Habitual Action , 2010, Neuropsychopharmacology.

[19] W K Richardson,et al. Stimulus stringing by pigeons. , 1981, Journal of the experimental analysis of behavior.

[20] S P Wise,et al. Distributed modular architectures linking basal ganglia, cerebellum, and cerebral cortex: their role in planning and controlling action. , 1995, Cerebral cortex.

[21] H. Bergman,et al. Goal-directed and habitual control in the basal ganglia: implications for Parkinson's disease , 2010, Nature Reviews Neuroscience.

[22] W. Schultz,et al. Adaptive Coding of Reward Value by Dopamine Neurons , 2005, Science.

[23] P. Glimcher,et al. Neuroeconomics: The Consilience of Brain and Decision , 2004, Science.

[24] W. Schultz. Dopamine signals for reward value and risk: basic and recent data , 2010, Behavioral and Brain Functions.

[25] W. Schultz,et al. Dopamine responses comply with basic assumptions of formal learning theory , 2001, Nature.

[26] R. O’Reilly,et al. Separate neural substrates for skill learning and performance in the ventral and dorsal striatum , 2007, Nature Neuroscience.

[27] W. Schultz. Responses of midbrain dopamine neurons to behavioral trigger stimuli in the monkey. , 1986, Journal of neurophysiology.

[28] E. Miller,et al. An integrative theory of prefrontal cortex function. , 2001, Annual review of neuroscience.

[29] K. Doya. Modulators of decision making , 2008, Nature Neuroscience.

[30] Thomas E. Hazy,et al. Neural mechanisms of acquired phasic dopamine responses in learning , 2010, Neuroscience & Biobehavioral Reviews.

[31] J. Wickens,et al. Computational models of the basal ganglia: from robots to membranes , 2004, Trends in Neurosciences.

[32] I. Gormezano,et al. Nictitating Membrane: Classical Conditioning and Extinction in the Albino Rabbit , 1962, Science.

[33] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst.

[34] W. Schultz. Multiple dopamine functions at different time courses. , 2007, Annual review of neuroscience.

[35] Y. Niv. Reinforcement learning in the brain , 2009 .

[36] P. Dayan,et al. Reinforcement learning: The Good, The Bad and The Ugly , 2008, Current Opinion in Neurobiology.

[37] P. Dayan,et al. A framework for mesencephalic dopamine systems based on predictive Hebbian learning , 1996, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[38] K. Berridge,et al. What is the role of dopamine in reward: hedonic impact, reward learning, or incentive salience? , 1998, Brain Research Reviews.

[39] P. Glimcher. Decisions, Uncertainty, and the Brain: The Science of Neuroeconomics , 2003 .

[40] J. Mayhew,et al. How Visual Stimuli Activate Dopaminergic Neurons at Short Latency , 2005, Science.

[41] J. Goodnow. Determinants of choice-distribution in two-choice situations. , 1955, The American journal of psychology.

[42] B. Skinner,et al. Principles of Behavior , 1944 .

[43] A M Graybiel,et al. The basal ganglia and adaptive motor control. , 1994, Science.

[44] K. Doya,et al. Understanding Neural Coding through the Model-Based Analysis of Decision Making , 2007, The Journal of Neuroscience.

[45] John P O'Doherty,et al. Model-based approaches to neuroimaging: combining reinforcement learning theory with fMRI data. , 2010, Wiley interdisciplinary reviews. Cognitive science.

[46] D. Wolpert. Probabilistic models in human sensorimotor control. , 2007, Human movement science.

[47] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.

[48] H. Bergman,et al. The dynamics of dopamine in control of motor behavior , 2009, Current Opinion in Neurobiology.

[49] K. Doya,et al. Representation of Action-Specific Reward Values in the Striatum , 2005, Science.

[50] J. W. Moore,et al. Conditioned response timing and integration in the cerebellum. , 1997, Learning & memory.

[51] E. Miller,et al. Different time courses of learning-related activity in the prefrontal cortex and striatum , 2005, Nature.

[52] D. Joel,et al. The organization of the basal ganglia-thalamocortical circuits: Open interconnected rather than closed segregated , 1994, Neuroscience.

[53] M. Frank,et al. From reinforcement learning models to psychiatric and neurological disorders , 2011, Nature Neuroscience.

[54] B. Balleine,et al. Reward‐guided learning beyond dopamine in the nucleus accumbens: the integrative functions of cortico‐basal ganglia networks , 2008, The European journal of neuroscience.

[55] P. Glimcher,et al. Value Representations in the Primate Striatum during Matching Behavior , 2008, Neuron.

[56] Garrett E. Alexander. Basal ganglia , 1998 .

[57] B. Balleine,et al. The integrative function of the basal ganglia in instrumental conditioning , 2009, Behavioural Brain Research.

[58] Andrew G. Barto,et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining , 2009, NIPS.

[59] David S. Touretzky,et al. Representation and Timing in Theories of the Dopamine System , 2006, Neural Computation.

[60] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 2005, IEEE Transactions on Neural Networks.

[61] K. Doya,et al. Multiple Representations of Belief States and Action Values in Corticobasal Ganglia Loops , 2007, Annals of the New York Academy of Sciences.

[62] E. Tolman. Cognitive maps in rats and men. , 1948, Psychological review.

[63] Benjamin O. Turner,et al. Cortical and basal ganglia contributions to habit learning and automaticity , 2010, Trends in Cognitive Sciences.

[64] K. Doya,et al. Validation of Decision-Making Models and Analysis of Decision Variables in the Rat Basal Ganglia , 2009, The Journal of Neuroscience.

[65] W. Schultz,et al. Influence of Reward Delays on Responses of Dopamine Neurons , 2008, The Journal of Neuroscience.

[66] Peter Dayan,et al. [title unavailable] .

[67] Michael X. Cohen,et al. Neurocomputational models of basal ganglia function in learning, memory and choice , 2009, Behavioural Brain Research.

[68] J. D. Miller,et al. Mesencephalic dopaminergic unit activity in the behaviorally conditioned rat. , 1981, Life sciences.

[69] J. Doyon,et al. Contributions of the basal ganglia and functionally related brain structures to motor learning , 2009, Behavioural Brain Research.

[70] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.

[71] Jonathan D. Cohen,et al. Imaging valuation models in human choice. , 2006, Annual review of neuroscience.

[72] K. Berridge. The debate over dopamine’s role in reward: the case for incentive salience , 2007, Psychopharmacology.

[73] S. Haber,et al. Reward-Related Cortical Inputs Define a Large Striatal Region in Primates That Interface with Associative Cortical Connections, Providing a Substrate for Incentive-Based Learning , 2006, The Journal of Neuroscience.

[74] B. Balleine,et al. Motivational control of goal-directed action , 1994 .

[75] E. Thorndike. Animal Intelligence; Experimental Studies , 2009 .

[76] M. Botvinick,et al. Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective , 2009, Cognition.

[77] E. Kehoe,et al. Temporal primacy overrides prior training in serial compound conditioning of the rabbit’s nictitating membrane response , 1987 .

[78] Richard S. Sutton,et al. Stimulus Representation and the Timing of Reward-Prediction Errors in Models of the Dopamine System , 2008, Neural Computation.

[79] H. Bergman,et al. Information processing, dimensionality reduction and reinforcement learning in the basal ganglia , 2003, Progress in Neurobiology.

[80] K. Doya,et al. The computational neurobiology of learning and reward , 2006, Current Opinion in Neurobiology.

[81] S. Haber. The primate basal ganglia: parallel and integrative networks , 2003, Journal of Chemical Neuroanatomy.

[82] J. Mink. THE BASAL GANGLIA: FOCUSED SELECTION AND INHIBITION OF COMPETING MOTOR PROGRAMS , 1996, Progress in Neurobiology.

[83] Kenji Doya,et al. Reinforcement learning: Computational theory and biological mechanisms , 2007, HFSP journal.

[84] M. Frank,et al. Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. , 2006, Psychological review.

[85] Michael J. Frank,et al. By Carrot or by Stick: Cognitive Reinforcement Learning in Parkinsonism , 2004, Science.

[86] John B. Watson,et al. Behavior : An Introduction to Comparative Psychology , 2006 .

[87] B. Campbell,et al. Punishment and aversive behavior , 1969 .

[88] P. Dayan,et al. States versus Rewards: Dissociable Neural Prediction Error Signals Underlying Model-Based and Model-Free Reinforcement Learning , 2010, Neuron.

[89] W. Schultz. Predictive reward signal of dopamine neurons. , 1998, Journal of neurophysiology.

[90] J. Wickens,et al. Striatal Contributions to Reward and Decision Making , 2007 .

[91] A. Dickinson,et al. Parallel and interactive learning processes within the basal ganglia: Relevance for the understanding of addiction , 2009, Behavioural Brain Research.

[92] Amy J. Tindell,et al. Ventral pallidal neurons code incentive motivation: amplification by mesolimbic sensitization and amphetamine , 2005, The European journal of neuroscience.

[93] Peter Redgrave,et al. A computational model of action selection in the basal ganglia. I. A new functional anatomy , 2001, Biological Cybernetics.

[94] Michael X. Cohen,et al. Neurocomputational mechanisms of reinforcement-guided learning in humans: A review , 2008, Cognitive, affective & behavioral neuroscience.

[95] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.

[96] R. Wise. Dopamine, learning and motivation , 2004, Nature Reviews Neuroscience.

[97] E. Tolman. There is more than one kind of learning. , 1949, Psychological review.

[98] Kevin N. Gurney,et al. The Basal Ganglia and Cortex Implement Optimal Decision Making Between Alternative Actions , 2007, Neural Computation.

[99] A G Barto,et al. Learning by statistical cooperation of self-interested neuron-like computing elements. , 1985, Human neurobiology.

[100] O. Hikosaka. Basal Ganglia Mechanisms of Reward‐Oriented Eye Movement , 2007, Annals of the New York Academy of Sciences.

[101] Eytan Ruppin,et al. Actor-critic models of the basal ganglia: new anatomical and computational perspectives , 2002, Neural Networks.

[102] P. Dayan,et al. Decision theory, reinforcement learning, and the brain , 2008, Cognitive, affective & behavioral neuroscience.

[103] Francesco Mannella,et al. The roles of the amygdala in the affective regulation of body, brain, and behaviour , 2010, Connect. Sci.

[104] P. Redgrave,et al. What is reinforced by phasic dopamine signals? , 2008, Brain Research Reviews.

[105] C. B. Ferster,et al. Schedules of reinforcement , 1957 .

[106] Stephen M. Kosslyn,et al. Memory and mind : a festschrift for Gordon H. Bower , 2007 .

[107] Edgar H Vogel,et al. Computational Theories of Classical Conditioning , 2002 .

[108] Michael X. Cohen,et al. Different neural systems adjust motor behavior in response to reward and punishment , 2007, NeuroImage.

[109] Angela J. Yu,et al. Uncertainty, Neuromodulation, and Attention , 2005, Neuron.

[110] P. Montague,et al. Neuroeconomic Approaches to Mental Disorders , 2010, Neuron.

[111] Ashvin Shah,et al. Functional mechanisms of motor skill acquisition , 2007, BMC Neuroscience.

[112] W. Schultz. Behavioral theories and the neurophysiology of reward. , 2006, Annual review of psychology.

[113] John M. Ennis,et al. A neurobiological theory of automaticity in perceptual categorization. , 2007, Psychological review.

[114] Scott T. Grafton,et al. Evidence for a distributed hierarchy of action representation in the brain. , 2007, Human movement science.

[115] David S. Touretzky,et al. Long-Term Reward Prediction in TD Models of the Dopamine System , 2002, Neural Computation.

[116] James L Olds,et al. Positive reinforcement produced by electrical stimulation of septal area and other regions of rat brain. , 1954, Journal of comparative and physiological psychology.

[117] Sham M. Kakade,et al. Opponent interactions between serotonin and dopamine , 2002, Neural Networks.

[118] B. Balleine,et al. The Role of the Dorsal Striatum in Reward and Decision-Making , 2007, The Journal of Neuroscience.

[119] Ethan S. Bromberg-Martin,et al. Dopamine in Motivational Control: Rewarding, Aversive, and Alerting , 2010, Neuron.

[120] Kenji Doya,et al. What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? , 1999, Neural Networks.

[121] Takeo Watanabe,et al. Temporally Extended Dopamine Responses to Perceptually Demanding Reward-Predictive Stimuli , 2010, The Journal of Neuroscience.

[122] K. Berridge,et al. Intra-Accumbens Amphetamine Increases the Conditioned Incentive Salience of Sucrose Reward: Enhancement of Reward “Wanting” without Enhanced “Liking” or Response Reinforcement , 2000, The Journal of Neuroscience.

[123] P. Dayan,et al. Dopamine, uncertainty and TD learning , 2005, Behavioral and Brain Functions.

[124] J. Hollerman,et al. Changes in behavior-related neuronal activity in the striatum during learning , 2003, Trends in Neurosciences.

[125] A. Graybiel. The basal ganglia: learning new tricks and loving it , 2005, Current Opinion in Neurobiology.

[126] G. E. Alexander,et al. Parallel organization of functionally segregated circuits linking basal ganglia and cortex. , 1986, Annual review of neuroscience.

[127] J. Horvitz. Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events , 2000, Neuroscience.

[128] J. Hollerman,et al. Dopamine neurons report an error in the temporal prediction of reward during learning , 1998, Nature Neuroscience.

[129] P. Dayan,et al. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control , 2005, Nature Neuroscience.

[130] J. Wickens. Synaptic plasticity in the basal ganglia , 2009, Behavioural Brain Research.

[131] Alex Kacelnik,et al. State-dependent learning and suboptimal choice: when starlings prefer long over short delays to food , 2005, Animal Behaviour.

[132] B. Everitt,et al. Emotion and motivation: the role of the amygdala, ventral striatum, and prefrontal cortex , 2002, Neuroscience & Biobehavioral Reviews.

[133] J. Tanji,et al. Activity in the Lateral Prefrontal Cortex Reflects Multiple Steps of Future Events in Action Plans , 2006, Neuron.

[134] Joel L. Davis,et al. A Model of How the Basal Ganglia Generate and Use Neural Signals That Predict Reinforcement , 1994 .

[135] J. Tanji,et al. Role of the lateral prefrontal cortex in executive behavioral control. , 2008, Physiological reviews.

[136] B. Knowlton,et al. Learning and memory functions of the Basal Ganglia. , 2002, Annual review of neuroscience.

[137] R. Dolan,et al. Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans , 2006, Nature.

[138] W. F. Prokasy,et al. Classical conditioning II: Current research and theory. , 1972 .

[139] Adam Johnson,et al. Computing motivation: Incentive salience boosts of drug or appetite states , 2008, Behavioral and Brain Sciences.

[140] S. Siegel,et al. Decision making behavior in a two-choice uncertain outcome situation. , 2010, Journal of experimental psychology.

[141] Andrew Zisserman,et al. Advances in Neural Information Processing Systems (NIPS) , 2007 .

[142] Bernard Widrow,et al. Adaptive switching circuits , 1988 .

[143] P. Dayan,et al. Cortical substrates for exploratory decisions in humans , 2006, Nature.

[144] A. Dickinson. Actions and habits: the development of behavioural autonomy , 1985 .

[145] J. Wallis. Orbitofrontal cortex and its contribution to decision-making. , 2007, Annual review of neuroscience.

[146] J. W. Aldridge,et al. Dissecting components of reward: 'liking', 'wanting', and learning. , 2009, Current opinion in pharmacology.

[147] P. Goldman-Rakic. Cellular basis of working memory , 1995, Neuron.

[148] E. Vaadia,et al. Midbrain dopamine neurons encode decisions for future action , 2006, Nature Neuroscience.

[149] Mitsuo Kawato,et al. Heterarchical reinforcement-learning model for integration of multiple cortico-striatal loops: fMRI examination in stimulus-action-reward association learning , 2006, Neural Networks.

[150] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.

[151] A. Björklund,et al. Dopamine neuron systems in the brain: an update , 2007, Trends in Neurosciences.

[152] Michael J. Frank,et al. Making Working Memory Work: A Computational Model of Learning in the Prefrontal Cortex and Basal Ganglia , 2006, Neural Computation.

[153] A. Barto,et al. Effect on movement selection of an evolving sensory representation: A multiple controller model of skill acquisition , 2009, Brain Research.

[154] Kyle S. Smith,et al. Corticostriatal Interactions during Learning, Memory Processing, and Decision Making , 2009, The Journal of Neuroscience.

[155] J. Gold,et al. The neural basis of decision making. , 2007, Annual review of neuroscience.

[156] Bradley B. Doll,et al. The basal ganglia in reward and decision making: computational models and empirical studies , 2009 .

[157] R. M. Elliott,et al. Behavior of Organisms , 1991 .

[158] A. Barto,et al. Adaptive Critics and the Basal Ganglia , 1994 .

[159] W. Schultz,et al. Discrete Coding of Reward Probability and Uncertainty by Dopamine Neurons , 2003, Science.

[160] John S. Edwards,et al. The Hedonistic Neuron: A Theory of Memory, Learning and Intelligence , 1983 .

[161] R. Palmiter,et al. Reward without Dopamine , 2003, The Journal of Neuroscience.

[162] Sabrina M. Tom,et al. The Neural Correlates of Motor Skill Automaticity , 2005, The Journal of Neuroscience.

[163] P. Strick,et al. Basal-ganglia 'projections' to the prefrontal cortex of the primate. , 2002, Cerebral cortex.

[164] T. Prescott,et al. The ventral basal ganglia, a selection mechanism at the crossroads of space, strategy, and reward. , 2010, Progress in Neurobiology.

[165] I. Pavlov. Conditioned Reflexes: An Investigation of the Physiological Activity of the Cerebral Cortex , 1929 .

[166] P. Glimcher,et al. Midbrain Dopamine Neurons Encode a Quantitative Reward Prediction Error Signal , 2005, Neuron.

[167] R J Herrnstein,et al. Relative and absolute strength of response as a function of frequency of reinforcement. , 1961, Journal of the experimental analysis of behavior.

[168] O. Hikosaka,et al. Dopamine Neurons Can Represent Context-Dependent Prediction Error , 2004, Neuron.

[169] R. Sutton,et al. Simulation of anticipatory responses in classical conditioning by a neuron-like adaptive element , 1982, Behavioural Brain Research.

[170] L. Green,et al. A discounting framework for choice with delayed and probabilistic rewards. , 2004, Psychological bulletin.

[171] A G Barto,et al. Toward a modern theory of adaptive networks: expectation and prediction. , 1981, Psychological review.

[172] Keiji Tanaka,et al. Neuronal Correlates of Goal-Based Motor Selection in the Prefrontal Cortex , 2003, Science.

[173] R. A. Gardner,et al. Multiple-choice decision-behavior. , 1958, The American journal of psychology.

[174] P. L. Brown,et al. Auto-shaping of the pigeon's key-peck. , 1968, Journal of the experimental analysis of behavior.

[175] P. Dayan,et al. A normative perspective on motivation , 2006, Trends in Cognitive Sciences.

[176] Roderic A. Grupen,et al. A Framework for the Development of Robot Behavior , 2005 .

[177] Richard S. Sutton,et al. Training and Tracking in Robotics , 1985, IJCAI.

[178] S. Nicola. The nucleus accumbens as part of a basal ganglia action selection circuit , 2007, Psychopharmacology.

[179] W. Schultz,et al. Coding of Predicted Reward Omission by Dopamine Neurons in a Conditioned Inhibition Paradigm , 2003, The Journal of Neuroscience.

[180] Ashvin Shah. Biologically-based functional mechanisms of motor skill acquisition , 2008 .

[181] Carol A. Seger,et al. Category learning in the brain. , 2010, Annual review of neuroscience.

[182] Paolo Calabresi,et al. Dopamine-mediated regulation of corticostriatal synaptic plasticity , 2007, Trends in Neurosciences.

[183] T. Stanford,et al. Subcortical loops through the basal ganglia , 2005, Trends in Neurosciences.

[184] M. Roesch,et al. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards , 2007, Nature Neuroscience.

[185] W. Schultz,et al. Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task , 1993, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[186] J. W. Aldridge,et al. Coding of Serial Order by Neostriatal Neurons: A “Natural Action” Approach to Movement Sequence , 1998, The Journal of Neuroscience.

[187] R. Rescorla,et al. A theory of Pavlovian conditioning : Variations in the effectiveness of reinforcement and nonreinforcement , 1972 .

[188] T. Maia. Reinforcement learning, conditioning, and the brain: Successes and challenges , 2009, Cognitive, affective & behavioral neuroscience.

[189] L. Kamin. Predictability, surprise, attention, and conditioning , 1967 .

[190] Saumya Das,et al. Expression of the Alzheimer amyloid-promoting factor antichymotrypsin is induced in human astrocytes by IL-1 , 1995, Neuron.