Re-aligning models of habitual and goal-directed decision-making
Giovanni Pezzulo | Amitai Shenhav | Kevin J. Miller | Elliot A. Ludvig
[1] Wendy Wood,et al. Psychology of Habit. , 2016, Annual review of psychology.
[2] S. Glover. Planning and control in action , 2004, Behavioral and Brain Sciences.
[3] R. Dolan,et al. Ventral striatal dopamine reflects behavioral and neural signatures of model-based control during sequential decision making , 2015, Proceedings of the National Academy of Sciences.
[4] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[5] Gregory Ashby,et al. A neuropsychological theory of multiple systems in category learning. , 1998, Psychological review.
[6] H. Yin,et al. The role of the basal ganglia in habit formation , 2006, Nature Reviews Neuroscience.
[7] N. Daw,et al. Generalization of value in reinforcement learning by humans , 2012, The European journal of neuroscience.
[8] M. Botvinick,et al. Planning as inference , 2012, Trends in Cognitive Sciences.
[9] Alice Y. Chiang,et al. Working-memory capacity protects model-based learning from stress , 2013, Proceedings of the National Academy of Sciences.
[10] Karl J. Friston,et al. Computational psychiatry , 2012, Trends in Cognitive Sciences.
[11] E. Tolman. Cognitive maps in rats and men. , 1948, Psychological review.
[12] P. Dayan,et al. Mapping value based planning and extensively trained choice in the human brain , 2012, Nature Neuroscience.
[13] Karl J. Friston,et al. Active Inference, homeostatic regulation and adaptive behavioural control , 2015, Progress in Neurobiology.
[14] Richard S. Sutton,et al. Dyna, an integrated architecture for learning, planning, and reacting , 1990, SGAR.
[15] Donald A. Norman,et al. Attention to Action , 1986 .
[16] N. Daw,et al. The ubiquity of model-based reinforcement learning , 2012, Current Opinion in Neurobiology.
[17] B. Balleine,et al. Habits, action sequences and reinforcement learning , 2012, The European journal of neuroscience.
[18] T. Robbins,et al. Reliance on habits at the expense of goal-directed control following dopamine precursor depletion , 2011, Psychopharmacology.
[19] Walter Schneider,et al. Controlled and automatic human information processing: II. Perceptual learning, automatic attending and a general theory. , 1977 .
[20] Makoto Ito,et al. Model-based action planning involves cortico-cerebellar and basal ganglia networks , 2016, Scientific Reports.
[21] N. Daw,et al. Model-based learning protects against forming habits , 2015, Cognitive, Affective, & Behavioral Neuroscience.
[22] N. Daw,et al. Characterizing a psychiatric symptom dimension related to deficits in goal-directed control , 2016, eLife.
[23] A G Barto,et al. Toward a modern theory of adaptive networks: expectation and prediction. , 1981, Psychological review.
[24] P. Dayan,et al. Model-based influences on humans’ choices and striatal prediction errors , 2011, Neuron.
[25] Joel Veness,et al. Monte-Carlo Planning in Large POMDPs , 2010, NIPS.
[26] The formation of habits in the neocortex under the implicit supervision of the basal ganglia , 2015, BMC Neuroscience.
[27] A. Markman,et al. The Curse of Planning: Dissecting Multiple Reinforcement-Learning Systems by Taxing the Central Executive , 2013 .
[28] Hannah M. Batchelor,et al. Dopamine Neurons Respond to Errors in the Prediction of Sensory Features of Expected Rewards , 2017, Neuron.
[29] Wouter Kool,et al. Cost-Benefit Arbitration Between Multiple Reinforcement-Learning Systems , 2017, Psychological science.
[30] A. Markman,et al. Retrospective Revaluation in Sequential Decision Making: A Tale of Two Systems , 2012, Journal of Experimental Psychology: General.
[31] Joel L. Davis,et al. A Model of How the Basal Ganglia Generate and Use Neural Signals That Predict Reinforcement , 1994 .
[32] P. Phillips,et al. Subsecond dopamine fluctuations in human striatum encode superposed error signals about actual and counterfactual reward , 2015, Proceedings of the National Academy of Sciences.
[33] Giovanni Pezzulo,et al. Divide et impera: subgoaling reduces the complexity of probabilistic inference and problem solving , 2015, Journal of The Royal Society Interface.
[34] Roshan Cools,et al. Habitual versus Goal-directed Action Control in Parkinson Disease , 2011, Journal of Cognitive Neuroscience.
[35] Joshua W. Brown,et al. Medial prefrontal cortex as an action-outcome predictor , 2011, Nature Neuroscience.
[36] W. Seeley. Attention and Cognitive Control in Affective Perception for Embodied Appraisals , 2013 .
[37] G. Schoenbaum,et al. Transition from ‘model-based’ to ‘model-free’ behavioral control in addiction: Involvement of the orbitofrontal cortex and dorsolateral striatum , 2014, Neuropharmacology.
[38] Kyle S. Smith,et al. A dual operator view of habitual behavior reflecting cortical and striatal dynamics. , 2013, Neuron.
[39] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[40] Ali Ghazizadeh,et al. Parallel basal ganglia circuits for decision making , 2018, Journal of Neural Transmission.
[41] D. Hassabis,et al. Neural Mechanisms of Hierarchical Planning in a Virtual Subway Network , 2016, Neuron.
[42] Jonathan Evans. Dual-processing accounts of reasoning, judgment, and social cognition. , 2008, Annual review of psychology.
[43] Richard S. Sutton,et al. Sample-based learning and search with permanent and transient memories , 2008, ICML '08.
[44] Z. Kurth-Nelson,et al. A theoretical account of cognitive effects in delay discounting , 2012, The European journal of neuroscience.
[45] L. J. Hammond. The effect of contingency upon the appetitive conditioning of free-operant behavior. , 1980, Journal of the experimental analysis of behavior.
[46] Christopher D. Adams,et al. The Effect of the Instrumental Training Contingency on Susceptibility to Reinforcer Devaluation , 1983 .
[47] Kevin J. Miller,et al. Habits without Values , 2016, bioRxiv.
[48] P. Janak,et al. Defining the place of habit in substance use disorders , 2017, Progress in Neuro-Psychopharmacology and Biological Psychiatry.
[49] Peter Dayan,et al. Bonsai Trees in Your Head: How the Pavlovian System Sculpts Goal-Directed Choices by Pruning Decision Trees , 2012, PLoS Comput. Biol..
[50] C. L. Hull. Principles of behavior : an introduction to behavior theory , 1943 .
[51] Karl J. Friston,et al. Neuroscience and Biobehavioral Reviews , 2022 .
[52] M. Botvinick,et al. The successor representation in human reinforcement learning , 2016, bioRxiv.
[53] Wendy Wood,et al. Habit and intention in everyday life: The multiple processes by which past behavior predicts future behavior. , 1998 .
[54] Karl J. Friston,et al. Hierarchical Active Inference: A Theory of Motivated Control , 2018, Trends in Cognitive Sciences.
[55] J. Buckholtz. Social norms, self-control, and the value of antisocial behavior , 2015, Current Opinion in Behavioral Sciences.
[56] Hilbert J. Kappen,et al. Risk Sensitive Path Integral Control , 2010, UAI.
[57] G. Oettingen. Future thought and behaviour change , 2012 .
[58] Peter Dayan,et al. Interplay of approximate planning strategies , 2015, Proceedings of the National Academy of Sciences.
[59] W. T. Maddox,et al. Human Category Learning 2.0 , 2022, Annals of the New York Academy of Sciences.
[60] Andrea Brovelli,et al. Advanced Parkinson's disease effect on goal-directed and habitual processes involved in visuomotor associative learning , 2013, Front. Hum. Neurosci..
[61] Peter Dayan,et al. Improving Generalization for Temporal Difference Learning: The Successor Representation , 1993, Neural Computation.
[62] M. Botvinick,et al. Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective , 2009, Cognition.
[63] Simon Hong,et al. A pallidus-habenula-dopamine pathway signals inferred stimulus values. , 2010, Journal of neurophysiology.
[64] Michael L. Littman,et al. Reinforcement learning improves behaviour from evaluative feedback , 2015, Nature.
[65] N. Daw,et al. Multiple Systems for Value Learning , 2014 .
[66] Seth A. Herd,et al. Goal-Driven Cognition in the Brain: A Computational Framework , 2014, arXiv:1404.7591.
[67] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[68] Giovanni Pezzulo,et al. The Mixed Instrumental Controller: Using Value of Information to Combine Habitual Choice and Mental Simulation , 2013, Front. Psychol..
[69] A. Graybiel. Habits, rituals, and the evaluative brain. , 2008, Annual review of neuroscience.
[70] B. Balleine,et al. Human and Rodent Homologies in Action Control: Corticostriatal Determinants of Goal-Directed and Habitual Action , 2010, Neuropsychopharmacology.
[71] P. Dayan,et al. Adaptive integration of habits into depth-limited planning defines a habitual-goal–directed spectrum , 2016, Proceedings of the National Academy of Sciences.
[72] Amir Dezfouli,et al. Habits as action sequences: hierarchical action control and changes in outcome value , 2014, Philosophical Transactions of the Royal Society B: Biological Sciences.
[73] P. Dayan,et al. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control , 2005, Nature Neuroscience.
[74] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[75] Samuel J. Gershman,et al. Predictive representations can link model-based reinforcement learning to model-free mechanisms , 2017 .
[76] S. Killcross,et al. Coordination of actions and habits in the medial prefrontal cortex of rats. , 2003, Cerebral cortex.
[77] F. Cushman. Action, Outcome, and Value , 2013, Personality and social psychology review : an official journal of the Society for Personality and Social Psychology, Inc.
[78] Shinsuke Shimojo,et al. Neural Computations Underlying Arbitration between Model-Based and Model-free Learning , 2013, Neuron.
[79] Eric B. Baum,et al. What is thought? , 2003 .
[80] Wouter Kool,et al. When Does Model-Based Control Pay Off? , 2016, PLoS Comput. Biol..
[81] A. Rangel. Regulation of dietary choice by the decision-making circuitry , 2013, Nature Neuroscience.
[82] P. Gollwitzer,et al. Planning and the Control of Action , 2017 .
[83] F. Cushman,et al. Habitual control of goal selection in humans , 2015, Proceedings of the National Academy of Sciences.
[84] David T. Neal,et al. A new look at habits and the habit-goal interface. , 2007, Psychological review.
[85] Elke U. Weber,et al. Correcting expected utility for comparisons between alternative outcomes: A unified parameterization of regret and disappointment , 2008 .
[86] Daniel B. Willingham,et al. A Neuropsychological Theory of Motor Skill Learning , 2004 .
[87] K. Newell. Motor skill acquisition. , 1991, Annual review of psychology.
[88] R. Dolan,et al. Dopamine Enhances Model-Based over Model-Free Choice Behavior , 2012, Neuron.
[89] P. Dayan,et al. The algorithmic anatomy of model-based evaluation , 2014, Philosophical Transactions of the Royal Society B: Biological Sciences.
[90] Geoffrey Schoenbaum,et al. Midbrain dopamine neurons compute inferred and cached value prediction errors in a common framework , 2016, eLife.
[91] M. Crockett. Models of morality , 2013, Trends in Cognitive Sciences.
[92] John M. Ennis,et al. A neurobiological theory of automaticity in perceptual categorization. , 2007, Psychological review.
[93] M. Botvinick. Hierarchical reinforcement learning and decision making , 2012, Current Opinion in Neurobiology.
[94] N. Daw,et al. Dopamine selectively remediates 'model-based' reward learning: a computational approach. , 2016, Brain : a journal of neurology.
[95] P. Dayan,et al. Goals and Habits in the Brain , 2013, Neuron.
[96] Sébastien Hélie,et al. A Neurocomputational Model of Automatic Sequence Production , 2015, Journal of Cognitive Neuroscience.
[97] H. Simon,et al. Models of Bounded Rationality: Economic Analysis and Public Policy , 1984 .
[98] Amir Dezfouli,et al. Speed/Accuracy Trade-Off between the Habitual and the Goal-Directed Processes , 2011, PLoS Comput. Biol..
[99] R. Costa,et al. Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions , 2013, Nature Communications.
[100] D. Spalding. The Principles of Psychology , 1873, Nature.
[101] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[102] C. Shea,et al. Motor skill learning and performance: a review of influential factors , 2010, Medical education.
[103] Richard S. Sutton,et al. Predictive Representations of State , 2001, NIPS.
[104] E. Thorndike. Animal Intelligence; Experimental Studies , 2009 .
[105] H. Aarts,et al. Habits as knowledge structures: automaticity in goal-directed behavior. , 2000, Journal of personality and social psychology.
[106] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.
[107] Christopher D. Adams. Variations in the Sensitivity of Instrumental Responding to Reinforcer Devaluation , 1982 .
[108] Karl J. Friston,et al. Active Inference: A Process Theory , 2017, Neural Computation.
[109] Jonathan D. Cohen,et al. Toward a Rational and Mechanistic Account of Mental Effort. , 2017, Annual review of neuroscience.
[110] A. Dickinson. Actions and habits: the development of behavioural autonomy , 1985 .
[111] J. O'Doherty,et al. Regret and its avoidance: a neuroimaging study of choice behavior , 2005, Nature Neuroscience.
[112] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[113] N. Daw,et al. Variability in Dopamine Genes Dissociates Model-Based and Model-Free Reinforcement Learning , 2016, The Journal of Neuroscience.