Which Temporal Difference Learning Algorithm Best Reproduces Dopamine Activity in a Multi-choice Task?
[1] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[2] O. Hikosaka, et al. Two types of dopamine neuron distinctly convey positive and negative motivational signals, 2009, Nature.
[3] P. Dayan, et al. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control, 2005, Nature Neuroscience.
[4] W. Schultz. Predictive reward signal of dopamine neurons, 1998, Journal of Neurophysiology.
[5] Y. Niv, et al. Dialogues on prediction errors, 2008, Trends in Cognitive Sciences.
[6] Peter Dayan, et al. A Neural Substrate of Prediction and Reward, 1997, Science.
[7] E. Vaadia, et al. Midbrain dopamine neurons encode decisions for future action, 2006, Nature Neuroscience.
[8] P. Glimcher, et al. Midbrain Dopamine Neurons Encode a Quantitative Reward Prediction Error Signal, 2005, Neuron.
[9] Saori C. Tanaka, et al. Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops, 2004, Nature Neuroscience.
[10] J. Hollerman, et al. Dopamine neurons report an error in the temporal prediction of reward during learning, 1998, Nature Neuroscience.
[11] M. Roesch, et al. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards, 2007, Nature Neuroscience.
[12] Amir Dezfouli, et al. Speed/Accuracy Trade-Off between the Habitual and the Goal-Directed Processes, 2011, PLoS Comput. Biol.
[13] N. Daw. Dopamine: at the intersection of reward and action, 2007, Nature Neuroscience.