When Does Reward Maximization Lead to Matching Law?
[1] D. W. Hands. The Matching Law: Papers In Psychology And Economics , 1999 .
[2] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[3] Saori C. Tanaka,et al. Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops , 2004, Nature Neuroscience.
[4] W Vaughan,et al. Melioration, matching, and maximization. , 1981, Journal of the experimental analysis of behavior.
[5] M. Davison,et al. The matching law: A research review. , 1988 .
[6] W. Baum,et al. Matching, undermatching, and overmatching in studies of choice. , 1979, Journal of the experimental analysis of behavior.
[7] H. Seung,et al. Linear-Nonlinear-Poisson Models of Primate Choice Dynamics , 2005, Journal of the experimental analysis of behavior.
[8] W. Newsome,et al. Matching Behavior and the Representation of Value in the Parietal Cortex , 2004, Science.
[9] L. T. DeCarlo. Matching and maximizing with variable-time schedules. , 1985, Journal of the experimental analysis of behavior.
[10] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[11] D. Shanks,et al. A Re-examination of Probability Matching and Rational Choice , 2002 .
[12] G M Heyman,et al. A Markov model description of changeover probabilities on concurrent variable-interval schedules. , 1979, Journal of the experimental analysis of behavior.
[13] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[14] M. Davison,et al. Effects of varying stimulus disparity and the reinforcer ratio in concurrent-schedule and signal-detection procedures. , 1991, Journal of the experimental analysis of behavior.
[15] G M Heyman,et al. Is matching compatible with reinforcement maximization on concurrent variable interval variable ratio? , 1979, Journal of the experimental analysis of behavior.
[16] R. Herrnstein,et al. Melioration and Behavioral Allocation , 1980 .
[17] J. E. Mazur. Optimization theory fails to predict performance of pigeons in a two-response situation. , 1981, Science.
[18] R J Herrnstein,et al. Relative and absolute strength of response as a function of frequency of reinforcement. , 1961, Journal of the experimental analysis of behavior.
[19] O. Hikosaka. Models of Information Processing in the Basal Ganglia, edited by James C. Houk, Joel L. Davis and David G. Beiser, The MIT Press, 1995, ISBN 0 262 08234 9 , 1995, Trends in Neurosciences.
[20] M. Davison,et al. Sensitivity of time allocation to an overall reinforcer rate feedback function in concurrent interval schedules. , 1989, Journal of the experimental analysis of behavior.
[21] J. Nevin,et al. Stimuli, reinforcers, and behavior: an integration. , 1999, Journal of the experimental analysis of behavior.
[22] Xiao-Jing Wang,et al. A Biophysically Based Neural Model of Matching Law Behavior: Melioration by Stochastic Synapses , 2006, The Journal of Neuroscience.
[23] Yutaka Sakai,et al. The Actor-Critic Learning Is Behind the Matching Law: Matching Versus Optimal Behaviors , 2008, Neural Computation.
[24] J. Staddon,et al. Limits to action, the allocation of individual behavior , 1982 .
[25] W M Baum,et al. Optimization and the matching law as accounts of instrumental behavior. , 1981, Journal of the experimental analysis of behavior.
[26] P. Chance. Learning and Behavior , 1979 .
[27] John N. Tsitsiklis,et al. Simulation-based optimization of Markov reward processes , 2001, IEEE Trans. Autom. Control.
[28] Yonatan Loewenstein,et al. Operant matching is a generic outcome of synaptic plasticity based on the covariance between reward and neural activity , 2006, Proceedings of the National Academy of Sciences.
[29] Yutaka Sakai,et al. Computational algorithms and neuronal network models underlying decision processes , 2006, Neural Networks.