Environmental statistics and the trade-off between model-based and TD learning in humans
[1] M. Gluck,et al. Interactive memory systems in the human brain , 2001, Nature.
[2] Larry King,et al. Feedback and task predictability as determinants of performance in multiple cue probability learning tasks , 1976 .
[3] Berndt Brehmer,et al. Task information and performance in probabilistic inference tasks , 1978 .
[4] T. Başar,et al. A New Approach to Linear Filtering and Prediction Problems , 2001 .
[5] A. Bechara. Decision making, impulse control and loss of willpower to resist drugs: a neurocognitive perspective , 2005, Nature Neuroscience.
[6] P. Dayan,et al. Model-based influences on humans’ choices and striatal prediction errors , 2011, Neuron.
[7] Peter Dayan,et al. Temporal difference models describe higher-order learning in humans , 2004, Nature.
[8] Amir Dezfouli,et al. Speed/Accuracy Trade-Off between the Habitual and the Goal-Directed Processes , 2011, PLoS Comput. Biol.
[9] Kenji Doya,et al. Brain mechanism of reward prediction under predictable and unpredictable environmental dynamics , 2006, Neural Networks.
[10] B. Balleine,et al. Multiple Forms of Value Learning and the Function of Dopamine , 2009 .
[11] Kenji Doya,et al. What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? , 1999, Neural Networks.
[12] A. David Redish,et al. Hippocampal replay contributes to within session learning in a temporal difference reinforcement learning model , 2005, Neural Networks.
[13] Peter Dayan,et al. Hippocampal Contributions to Control: The Third Way , 2007, NIPS.
[14] F. Toates. The interaction of cognitive and stimulus–response processes in the control of behaviour , 1997, Neuroscience & Biobehavioral Reviews.
[15] B. Balleine,et al. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates , 1998, Neuropharmacology.
[16] Karl J. Friston,et al. Dissociable Roles of Ventral and Dorsal Striatum in Instrumental Conditioning , 2004, Science.
[17] Peter Dayan,et al. Goal-directed control and its antipodes , 2009, Neural Networks.
[18] P. Dayan,et al. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control , 2005, Nature Neuroscience.
[19] P. Glimcher,et al. Dynamic response-by-response models of matching behavior in rhesus monkeys , 2005, Journal of the Experimental Analysis of Behavior.
[20] W. Todd Maddox,et al. Category number impacts rule-based but not information-integration category learning: further evidence for dissociable category-learning systems. , 2004, Journal of Experimental Psychology: Learning, Memory, and Cognition.
[21] Michael Kearns,et al. Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms , 1998, NIPS.
[22] M. Gluck,et al. Probabilistic classification learning in amnesia. , 1994, Learning & Memory.
[23] W. T. Maddox,et al. Dissociating explicit and procedural-learning based systems of perceptual category learning , 2004, Behavioural Processes.