Beyond simple model-free reinforcement learning in human decision making

Over the last two decades, there has been a large-scale effort in cognitive neuroscience to understand learning and decision making from the perspective of simple model-free reinforcement learning algorithms. This interest was invigorated in the mid-1990s, when it was realized that the phasic activity of midbrain dopaminergic neurons resembles reward prediction errors. These algorithms formalize the notion of learning from past experience through trial and error. Although important, they leave many aspects of behavior unexplained. More recent work has begun to fill in some of these gaps by borrowing further ideas from computational reinforcement learning. One line of inquiry has concentrated on aligning goal-directed behavior, which resembles the common-sense notion of “planning”, with model-based reinforcement learning. This work has aimed to understand how the brain learns the world model prescribed by the model-based framework, and to characterize the neural correlates of the value functions it predicts. This thesis adds to this work by offering two separate but related algorithmic accounts of how the brain may actually map the world model into a decision. Existing data are examined and new experiments are performed. A second line of inquiry has concentrated on understanding behavior from the perspective of hierarchical reinforcement learning. The thesis makes two contributions to this area as well. First, it is shown that the brain codes pseudo-reward prediction errors: prediction errors in response to a faux reward signal that is used to train skills that are not useful in themselves but that may serve as means to other ends. Second, an optimality framework is provided for understanding which skills are most beneficial to have when confronted with an ensemble of tasks.
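To make the two kinds of prediction error mentioned in the abstract concrete, the following is a minimal illustrative sketch, not the thesis's own model. It assumes a standard tabular TD(0) update, and the state names, learning rate `alpha`, discount `gamma`, and the pseudo-reward value of 1.0 are all hypothetical choices made for illustration only.

```python
def td_error(reward, value_next, value_current, gamma=0.9):
    """Temporal-difference prediction error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_current


def td_update(values, state, next_state, reward, alpha=0.1, gamma=0.9):
    """Apply one TD(0) update to `values` in place and return the prediction error."""
    delta = td_error(reward, values[next_state], values[state], gamma)
    values[state] += alpha * delta
    return delta


# Hypothetical value tables: one for the overall task, one for a subgoal-directed skill.
task_values = {"start": 0.0, "subgoal": 0.0, "goal": 0.0}
skill_values = {"start": 0.0, "subgoal": 0.0}

# Reaching the subgoal earns no external reward, so the task-level error is zero here.
task_delta = td_update(task_values, "start", "subgoal", reward=0.0)

# The skill, however, receives an assumed pseudo-reward of 1.0 for attaining its subgoal,
# which produces a pseudo-reward prediction error that trains the skill itself.
skill_delta = td_update(skill_values, "start", "subgoal", reward=1.0)

# Reaching the final goal earns external reward and a task-level prediction error.
goal_delta = td_update(task_values, "subgoal", "goal", reward=1.0)
```

Keeping separate value tables for the task and the skill is one simple way to show that the same event (arriving at the subgoal) can produce a prediction error at one level and none at the other.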

Alec Solway


[247] J. Fuster. Upper processing stages of the perception–action cycle , 2004, Trends in Cognitive Sciences.

[248] P. Holland,et al. Amygdala–frontal interactions and reward expectancy , 2004, Current Opinion in Neurobiology.

[249] Keiji Tanaka,et al. The role of the medial prefrontal cortex in achieving goals , 2004, Current Opinion in Neurobiology.

[250] A. Yuille,et al. Object perception as Bayesian inference. , 2004, Annual review of psychology.

[251] Peter Dayan,et al. Structure in the Space of Value Functions , 2002, Machine Learning.

[252] E. Rolls. The functions of the orbitofrontal cortex , 1999, Brain and Cognition.

[253] Nestor A. Schmajuk,et al. Purposive behavior and cognitive mapping: a neural network model , 1992, Biological Cybernetics.

[254] Gary D. Bernard,et al. A proposed mechanism for multiplication of neural signals , 1976, Biological Cybernetics.

[255] P. Dayan. The Convergence of TD(λ) for General λ , 1992, Machine Learning.

[256] J. Joseph,et al. Prefrontal cortex and spatial sequencing in macaque monkey , 2004, Experimental Brain Research.

[257] J. Tanji,et al. Integration of temporal order and object information in the monkey lateral prefrontal cortex. , 2004, Journal of neurophysiology.

[258] David M. Sobel,et al. A theory of causal learning in children: causal maps and Bayes nets. , 2004, Psychological review.

[259] David J. C. MacKay,et al. Information Theory, Inference, and Learning Algorithms , 2004, IEEE Transactions on Information Theory.

[260] Clay B. Holroyd,et al. Errors in reward prediction are reflected in the event-related brain potential , 2003 .

[261] Hanspeter A. Mallot,et al. 'Fine-to-Coarse' Route Planning and Navigation in Regionalized Environments , 2003, Spatial Cogn. Comput..

[262] B. Balleine,et al. The role of prelimbic cortex in instrumental conditioning , 2003, Behavioural Brain Research.

[263] B. Kolb,et al. Do rats have a prefrontal cortex? , 2003, Behavioural Brain Research.

[264] S. Killcross,et al. Inactivation of the infralimbic prefrontal cortex reinstates goal-directed responding in overtrained rats , 2003, Behavioural Brain Research.

[265] R. Zemel,et al. Inference and computation with population codes. , 2003, Annual review of neuroscience.

[266] E. Koechlin,et al. The Architecture of Cognitive Control in the Human Prefrontal Cortex , 2003, Science.

[267] J. Parkinson,et al. Dissociable Contributions of the Human Amygdala and Orbitofrontal Cortex to Incentive Motivation and Goal Selection , 2003, The Journal of Neuroscience.

[268] E. Rolls,et al. Human cortical responses to water in the mouth, and the effects of thirst. , 2003, Journal of neurophysiology.

[269] A. Graybiel,et al. Representation of Action Sequence Boundaries by Macaque Prefrontal Cortical Neurons , 2003, Science.

[270] G. Schoenbaum,et al. Encoding Predicted Outcome and Acquired Value in Orbitofrontal Cortex during Cue Sampling Depends upon Input from Basolateral Amygdala , 2003, Neuron.

[271] J. O'Doherty,et al. Encoding Predictive Reward Value in Human Amygdala and Orbitofrontal Cortex , 2003, Science.

[272] Keiji Tanaka,et al. Neuronal Correlates of Goal-Based Motor Selection in the Prefrontal Cortex , 2003, Science.

[273] G. Csibra,et al. Teleological reasoning in infancy: the naïve theory of rational action , 2003, Trends in Cognitive Sciences.

[274] Tai Sing Lee,et al. Hierarchical Bayesian inference in the visual cortex. , 2003, Journal of the Optical Society of America. A, Optics, image science, and vision.

[275] D. V. von Cramon,et al. Error Monitoring Using External Feedback: Specific Roles of the Habenular Complex, the Reward System, and the Cingulate Motor Area Revealed by Functional Magnetic Resonance Imaging , 2003, The Journal of Neuroscience.

[276] Joshua B. Tenenbaum,et al. Inferring causal networks from observations and interventions , 2003, Cogn. Sci..

[277] Colin Camerer. Behavioural studies of strategic thinking in games , 2003, Trends in Cognitive Sciences.

[278] Karl J. Friston,et al. Temporal Difference Models and Reward-Related Learning in the Human Brain , 2003, Neuron.

[279] Samuel M. McClure,et al. Temporal Prediction Errors in a Passive Learning Task Activate Human Striatum , 2003, Neuron.

[280] S. Killcross,et al. Coordination of actions and habits in the medial prefrontal cortex of rats. , 2003, Cerebral cortex.

[281] J. Saffran,et al. From Syllables to Syntax: Multilevel Statistical Learning by 12-Month-Old Infants , 2003 .

[282] K. Doya,et al. A unifying computational framework for motor control and social interaction. , 2003, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[283] N. Chater,et al. Simplicity: a unifying principle in cognitive science? , 2003, Trends in Cognitive Sciences.

[284] B. Balleine,et al. The Effect of Lesions of the Basolateral Amygdala on Instrumental Conditioning , 2003, The Journal of Neuroscience.

[285] Hagai Attias,et al. Planning by Probabilistic Inference , 2003, AISTATS.

[286] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..

[287] R. Passingham,et al. Prefrontal interactions reflect future task operations , 2003, Nature Neuroscience.

[288] B. Balleine,et al. Sensitivity to Instrumental Contingency Degradation Is Mediated by the Entorhinal Cortex and Its Efferents via the Dorsal Hippocampus , 2002, The Journal of Neuroscience.

[289] P. Montague,et al. Neural Economics and the Biological Substrates of Valuation , 2002, Neuron.

[290] Clay B. Holroyd,et al. The neural basis of human error processing: reinforcement learning, dopamine, and the error-related negativity. , 2002, Psychological review.

[291] N. Schmajuk,et al. Latent learning, shortcuts and detours: a computational model , 2002, Behavioural Processes.

[292] E. Murray,et al. The amygdala and reward , 2002, Nature Reviews Neuroscience.

[293] A. Diederich,et al. Survey of decision field theory , 2002, Math. Soc. Sci..

[294] Eytan Ruppin,et al. Actor-critic models of the basal ganglia: new anatomical and computational perspectives , 2002, Neural Networks.

[295] Philippe Gaussier,et al. From view cells and place cells to cognitive map learning: processing stages of the hippocampal system , 2002, Biological Cybernetics.

[296] W. Geisler. Ideal Observer Analysis , 2002 .

[297] C. Atance,et al. Episodic future thinking , 2001, Trends in Cognitive Sciences.

[298] John R. Anderson,et al. Tower of Hanoi: evidence for the cost of goal retrieval. , 2001, Journal of experimental psychology. Learning, memory, and cognition.

[299] G. Schoenbaum,et al. Integrating orbitofrontal cortex into prefrontal theory: common processing themes across species and subdivisions. , 2001, Learning & memory.

[300] D. Kahneman,et al. Functional Imaging of Neural Responses to Expectancy and Experience of Monetary Gains and Losses , 2001 .

[301] Samuel M. McClure,et al. Predictability Modulates Human Brain Response to Reward , 2001, The Journal of Neuroscience.

[302] J. Townsend,et al. Multialternative Decision Field Theory: A Dynamic Connectionist Model of Decision Making , 2001 .

[303] J. Gold,et al. Neural computations that underlie decisions about sensory stimuli , 2001, Trends in Cognitive Sciences.

[304] P. Gollwitzer,et al. Reflective and reflexive action control in patients with frontal brain lesions. , 2001, Neuropsychology.

[305] E. Miller,et al. An integrative theory of prefrontal cortex function. , 2001, Annual review of neuroscience.

[306] Finn V. Jensen,et al. Bayesian Networks and Decision Graphs , 2001, Statistics for Engineering and Information Science.

[307] Peter Dayan,et al. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems , 2001 .

[308] A. Nobre,et al. Hunger selectively modulates corticolimbic activation to food stimuli in humans. , 2001, Behavioral neuroscience.

[309] Jacob Feldman,et al. Minimization of Boolean complexity in human concept learning , 2000, Nature.

[310] E. Miller,et al. Task-specific neural activity in the primate prefrontal cortex. , 2000, Journal of neurophysiology.

[311] R. Kesner. Subregional analysis of mnemonic functions of the prefrontal cortex in the rat , 2000, Psychobiology.

[312] T. Shallice,et al. Contention Scheduling and the Control of Routine Activities , 2000, Cognitive neuropsychology.

[313] E. Murray,et al. Control of Response Selection by Reinforcer Value Requires Interaction of Amygdala and Orbital Prefrontal Cortex , 2000, The Journal of Neuroscience.

[314] J. Hollerman,et al. Reward processing in primate orbitofrontal cortex and basal ganglia. , 2000, Cerebral cortex.

[315] C. Cavada,et al. The anatomical connections of the macaque monkey orbitofrontal cortex. A review. , 2000, Cerebral cortex.

[316] H. Bekkering,et al. Imitation of gestures in children is goal-directed. , 2000, The Quarterly journal of experimental psychology. A, Human experimental psychology.

[317] C. Glymour. The Mind's Arrows: Bayes Nets and Graphical Causal Models in Psychology , 2000 .

[318] R. Malenka,et al. Dopaminergic modulation of neuronal excitability in the striatum and nucleus accumbens. , 2000, Annual review of neuroscience.

[319] Jonathan D. Cohen,et al. Conflict monitoring versus selection-for-action in anterior cingulate cortex , 1999, Nature.

[320] Kenji Doya,et al. What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? , 1999, Neural Networks.

[321] Michael L. Platt,et al. Neural correlates of decision variables in parietal cortex , 1999, Nature.

[322] Craig Boutilier,et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage , 1999, J. Artif. Intell. Res..

[323] S. Wise,et al. Rule-dependent neuronal activity in the prefrontal cortex , 1999, Experimental Brain Research.

[324] W. Schultz,et al. Relative reward preference in primate orbitofrontal cortex , 1999, Nature.

[325] A. Borst. Seeing smells: imaging olfactory learning in bees , 1999, Nature Neuroscience.

[326] B. Jones. Bounded Rationality , 1999 .

[327] Kevin Murphy,et al. Bayes net toolbox for Matlab , 1999 .

[328] Michael I. Jordan. Learning in Graphical Models , 1999, NATO ASI Series.

[329] Jeffrey N. Rouder,et al. Modeling Response Times for Two-Choice Decisions , 1998 .

[330] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning , 1998, ICML.

[331] Milos Hauskrecht,et al. Hierarchical Solution of Markov Decision Processes using Macro-actions , 1998, UAI.

[332] J. Staddon,et al. A Dynamic Route Finder for the Cognitive Map , 1998 .

[333] A. Graybiel. The Basal Ganglia and Chunking of Action Repertoires , 1998, Neurobiology of Learning and Memory.

[334] W. Newsome,et al. The Variable Discharge of Cortical Neurons: Implications for Connectivity, Computation, and Information Coding , 1998, The Journal of Neuroscience.

[335] B. Balleine,et al. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates , 1998, Neuropharmacology.

[336] B. Balleine,et al. The role of incentive learning in instrumental outcome revaluation by sensory-specific satiety , 1998 .

[337] B. Balleine,et al. Consciousness—the interface between affect and cognition , 1998 .

[338] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.

[339] C. Braun,et al. Event-Related Brain Potentials Following Incorrect Feedback in a Time-Estimation Task: Evidence for a Generic Neural System for Error Detection , 1997, Journal of Cognitive Neuroscience.

[340] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.

[341] Geoffrey E. Hinton,et al. Using Expectation-Maximization for Reinforcement Learning , 1997, Neural Computation.

[342] D H Brainard,et al. The Psychophysics Toolbox. , 1997, Spatial vision.

[343] V. Benassi,et al. Illusion of control: A meta-analytic review. , 1996 .

[344] J. Duncan,et al. Intelligence and the Frontal Lobe: The Organization of Goal-Directed Behavior , 1996, Cognitive Psychology.

[345] R W Cox,et al. AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. , 1996, Computers and biomedical research, an international journal.

[346] RU Muller,et al. The hippocampus as a cognitive graph , 1996, The Journal of general physiology.

[347] P. Dayan,et al. A framework for mesencephalic dopamine systems based on predictive Hebbian learning , 1996, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[348] E. Rolls,et al. The Orbitofrontal Cortex. Discussion , 1996 .

[349] Michael W. Montgomery,et al. Analysis of a disorder of everyday action , 1995 .

[350] Michael I. Jordan,et al. An internal model for sensorimotor integration. , 1995, Science.

[351] R. H. S. Carpenter,et al. Neural computation of log likelihood in control of saccadic eye movements , 1995, Nature.

[352] Geoffrey E. Hinton,et al. The Helmholtz Machine , 1995, Neural Computation.

[353] S. Kobayashi,et al. Electrophysiologic correlates of visuo-spatial attention shift. , 1995, Electroencephalography and clinical neurophysiology.

[354] J. Grafman,et al. Are the frontal lobes implicated in “planning” functions? Interpreting data from the Tower of Hanoi , 1995, Neuropsychologia.

[355] A. Barto. Adaptive Critics and the Basal Ganglia , 1995 .

[356] J. Wickens,et al. Cellular models of reinforcement. , 1995 .

[357] David Mumford,et al. Neuronal Architectures for Pattern-theoretic Problems , 1995 .

[358] John G. Taylor,et al. Route Finding by Neural Nets , 1995 .

[359] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .

[360] B. Balleine,et al. Role of cholecystokinin in the motivational control of instrumental action in rats. , 1994, Behavioral neuroscience.

[361] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[362] Joel L. Davis,et al. A Model of How the Basal Ganglia Generate and Use Neural Signals That Predict Reinforcement , 1994 .

[363] Sebastian Thrun,et al. Finding Structure in Reinforcement Learning , 1994, NIPS.

[364] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .

[365] Bartlett W. Mel. Synaptic integration in an excitable dendritic tree. , 1993, Journal of neurophysiology.

[366] J. Townsend,et al. Decision field theory: a dynamic-cognitive approach to decision making in an uncertain environment. , 1993, Psychological review.

[367] B. Balleine,et al. Actions and responses: The dual psychology of behaviour. , 1993 .

[368] B. Balleine. Instrumental performance following a shift in primary motivation depends on incentive learning. , 1992, Journal of experimental psychology. Animal behavior processes.

[369] Bartlett W. Mel. NMDA-Based Pattern Discrimination in a Modeled Cortical Neuron , 1992, Neural Computation.

[370] Michael I. Jordan,et al. Forward Models: Supervised Learning with a Distal Teacher , 1992, Cogn. Sci..

[371] Ross D. Shachter,et al. Decision Making Using Probabilistic Inference Methods , 1992, UAI.

[372] T. Poggio,et al. Multiplying with synapses and neurons , 1992 .

[373] Jürgen Schmidhuber,et al. Learning Complex, Extended Sequences Using the Principle of History Compression , 1992, Neural Computation.

[374] D Mumford,et al. On the computational architecture of the neocortex. II. The role of cortico-cortical loops. , 1992, Biological cybernetics.

[375] T. Shallice,et al. Deficits in strategy application following frontal lobe damage in man. , 1991, Brain : a journal of neurology.

[376] Thomas M. Cover,et al. Elements of Information Theory , 2005 .

[377] John R. Anderson. The Adaptive Character of Thought , 1990 .

[378] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.

[379] Ross D. Shachter,et al. Dynamic programming and influence diagrams , 1990, IEEE Trans. Syst. Man Cybern..

[380] J. Talairach,et al. Co-Planar Stereotaxic Atlas of the Human Brain: 3-Dimensional Proportional System: An Approach to Cerebral Imaging , 1988 .

[381] Richard S. Sutton,et al. Time-Derivative Models of Pavlovian Reinforcement , 1990 .

[382] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[383] K. H. Britten,et al. Neuronal correlates of a perceptual decision , 1989, Nature.

[384] A. Dickinson,et al. Reinforcer specificity of the suppression of instrumental performance on a non-contingent schedule , 1989, Behavioural Processes.

[385] B. Williams. The effects of response contingency and reinforcement identity on response suppression by alternative reinforcement , 1989 .

[386] B. Balleine,et al. Incentive learning and the motivational control of instrumental performance by thirst , 1989 .

[387] G. Micheletti. The Prefrontal Cortex. Anatomy, Physiology and Neuropsychology of the Frontal Lobe, Fuster J.M.. Raven Press, New York (1989) , 1989 .

[388] Gregory F. Cooper,et al. A Method for Using Belief Networks as Influence Diagrams , 2013, UAI 1988.

[389] R. Rescorla,et al. The role of response-reinforcer associations increases throughout extended instrumental training , 1988 .

[390] Allen Newell,et al. SOAR: An Architecture for General Intelligence , 1987, Artif. Intell..

[391] K. Wilcox,et al. Stimulation of the lateral habenula inhibits dopamine-containing neurons in the substantia nigra and ventral tegmental area of the rat , 1986, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[392] R. Rescorla,et al. Associative Structures In Instrumental Learning , 1986 .

[393] R. Rescorla,et al. Instrumental responding remains sensitive to reinforcer devaluation after extensive training , 1985 .

[394] J. Busemeyer. Decision making under uncertainty: a comparison of simple scalability, fixed-sample, and sequential-sampling models. , 1985, Journal of experimental psychology. Learning, memory, and cognition.

[395] A. Dickinson. Actions and habits: the development of behavioural autonomy , 1985 .

[396] R. Rescorla,et al. Postconditioning devaluation of a reinforcer affects instrumental responding. , 1985 .

[397] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .

[398] R. Weale. Vision. A Computational Investigation Into the Human Representation and Processing of Visual Information. David Marr , 1983 .

[399] T. Shallice. Specific impairments of planning. , 1982, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[400] Christopher D. Adams. Variations in the Sensitivity of Instrumental Responding to Reinforcer Devaluation , 1982 .

[401] Christopher D. Adams,et al. Instrumental Responding following Reinforcer Devaluation , 1981 .

[402] A G Barto,et al. Toward a modern theory of adaptive networks: expectation and prediction. , 1981, Psychological review.

[403] R. Passingham. The hippocampus as a cognitive map J. O'Keefe & L. Nadel, Oxford University Press, Oxford (1978). 570 pp., £25.00 , 1979, Neuroscience.

[404] K. Spence. Behavior Theory and Conditioning , 1978 .

[405] Roger Ratcliff,et al. A Theory of Memory Retrieval. , 1978 .

[406] C. Manski. The structure of random utility models , 1977 .

[407] Allen Newell,et al. Human Problem Solving. , 1973 .

[408] H. Simon,et al. Perception in chess , 1973 .

[409] A. Tversky. Elimination by aspects: A theory of choice. , 1972 .

[410] R. Rescorla. A theory of pavlovian conditioning: The effectiveness of reinforcement and non-reinforcement , 1972 .

[411] H B Barlow,et al. Pattern Recognition and the Responses of Sensory Neurons , 1969, Annals of the New York Academy of Sciences.