Timing and Partial Observability in the Dopamine System
[1] Richard S. Sutton, et al. Time-Derivative Models of Pavlovian Reinforcement, 1990.
[2] Yann Guédon, et al. Explicit state occupancy modelling by hidden semi-Markov models: application of Derin's scheme, 1990.
[3] Lonnie Chrisman, et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach, 1992, AAAI.
[4] Joel L. Davis, et al. A Model of How the Basal Ganglia Generate and Use Neural Signals That Predict Reinforcement, 1994.
[5] Michael O. Duff, et al. Reinforcement Learning Methods for Continuous-Time Markov Decision Problems, 1994, NIPS.
[6] P. Dayan, et al. A framework for mesencephalic dopamine systems based on predictive Hebbian learning, 1996, The Journal of neuroscience: the official journal of the Society for Neuroscience.
[7] Peter Dayan, et al. A Neural Substrate of Prediction and Reward, 1997, Science.
[8] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[9] J. Hollerman, et al. Dopamine neurons report an error in the temporal prediction of reward during learning, 1998, Nature Neuroscience.
[10] W. Schultz, et al. A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task, 1999, Neuroscience.
[11] Peter Dayan, et al. Acquisition in Autoshaping, 1999, NIPS.
[12] C. Gallistel, et al. Time, rate, and conditioning, 2000, Psychological review.
[13] David S. Touretzky, et al. Modeling Temporal Structure in Classical Conditioning, 2001, NIPS.
[14] Peter Dayan, et al. Motivated Reinforcement Learning, 2001, NIPS.
[15] R. Suri. Anticipatory responses of dopamine neurons and cortical neurons reproduced by internal model , 2001, Experimental Brain Research.
[16] E. Tira-Thompson. Combining Configural and TD Learning on a Robot , 2002, ICDL 2002.
[17] David S. Touretzky, et al. Long-Term Reward Prediction in TD Models of the Dopamine System, 2002, Neural Computation.