Simple Artificial Neural Networks That Match Probability and Exploit and Explore When Confronting a Multiarmed Bandit
Michael R. W. Dawson | Brian Dupuis | Marcia L. Spetch | Debbie M. Kelly
[1] W. Estes,et al. Analysis of a verbal conditioning situation in terms of statistical learning theory , 1954 .
[2] M. Dawson,et al. Minds and Machines: Connectionism and Psychological Modeling , 2003 .
[3] A. A. Mullin,et al. Principles of neurodynamics , 1962 .
[4] N. Longo. Probability-learning and habit-reversal in the cockroach. , 1964, The American journal of psychology.
[5] M. A. L. Thathachar,et al. A new approach to the design of reinforcement schemes for learning automata , 1985, IEEE Transactions on Systems, Man, and Cybernetics.
[6] Richard J. Herrnstein,et al. Derivatives of Matching. , 1979 .
[7] Richard S. Sutton,et al. Reinforcement Learning , 1992, Handbook of Machine Learning.
[8] D. Danks. Equilibria of the Rescorla-Wagner model , 2003 .
[9] Christian M. Ernst,et al. Multi-armed Bandit Allocation Indices , 1989 .
[10] M. Kalish,et al. Connectionism: A Hands-On Approach, Michael R.W. Dawson. Blackwell (2005), £50.00 (hbk)/£19.99 (pbk), (200 pp.), ISBN: 1 405 13074 1 (hbk)/1 405 12807 0 , 2006 .
[11] M. Dawson,et al. Connectionism and Classical Conditioning , 2008 .
[12] Nir Vulkan. An Economist's Perspective on Probability Matching , 2000 .
[13] R. Herrnstein,et al. Toward a law of response strength. , 1976 .
[14] Isaac Meilijson,et al. Evolution of Reinforcement Learning in Uncertain Environments: A Simple Explanation for Complex Foraging Behaviors , 2002, Adapt. Behav..
[15] Massimo Piattelli-Palmarini,et al. Evolution, selection and cognition: From “learning” to parameter setting in biology and in the study of language , 1989, Cognition.
[16] M. Bitterman,et al. Further experiments on probability-matching in the pigeon. , 1964, Journal of the experimental analysis of behavior.
[17] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[18] Michael R. W. Dawson,et al. Connectionist Selectionism: A Case Study of Parity , 2005 .
[19] Zahra Ansari,et al. The Quantitative Law of Effect is a Robust Emergent Property of an Evolutionary Algorithm for Reinforcement Learning , 2005, ECAL.
[20] Michael R. W. Dawson,et al. Autonomous processing in parallel distributed processing networks , 1992 .
[21] Kevin D. Glazebrook,et al. Multi-Armed Bandit Allocation Indices , 2011 .
[22] J J McDowell,et al. On the classic and modern theories of matching. , 2005, Journal of the experimental analysis of behavior.
[23] R. J. Herrnstein,et al. Relative and absolute strength of response as a function of frequency of reinforcement. , 1961, Journal of the experimental analysis of behavior.
[24] M. E. Bitterman,et al. Probability-Matching in the Fish , 1961 .
[25] A G Barto,et al. Toward a modern theory of adaptive networks: expectation and prediction. , 1981, Psychological review.
[26] D. W. Hands. The Matching Law: Papers In Psychology And Economics , 1999 .
[27] J. Bather,et al. Multi‐Armed Bandit Allocation Indices , 1990 .
[28] J. J. McDowell,et al. Undermatching is an emergent property of selection by consequences , 2007, Behavioural Processes.
[29] David C. Palmer,et al. Learning and Complex Behavior , 1993 .
[30] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[31] J. J. McDowell,et al. A computational theory of adaptive behavior based on an evolutionary reinforcement mechanism , 2006, GECCO.
[32] M. Davison,et al. The matching law: A research review. , 1988 .
[33] M. Bitterman,et al. Choice in honeybees as a function of the probability of reward , 1993 .
[34] J J McDowell,et al. A computational model of selection by consequences. , 2004, Journal of the experimental analysis of behavior.
[35] R. Herrnstein,et al. The Matching Law Papers in Psychology and Economics , 1997 .
[36] F. Rosenblatt,et al. The perceptron: a probabilistic model for information storage and organization in the brain. , 1958, Psychological review.
[37] M. E. Bitterman,et al. Probability-Learning by the Turtle , 1965, Science.
[38] N. Newcombe,et al. Is there a geometric module for spatial orientation? squaring theory and evidence , 2005, Psychonomic bulletin & review.
[39] Richard J. Herrnstein,et al. Maximizing and matching on concurrent ratio schedules , 1975 .
[40] Tamar Keasar,et al. Bees in two-armed bandit situations: foraging choices and possible decision mechanisms , 2002 .
[41] R. Herrnstein,et al. Maximizing and matching on concurrent ratio schedules. , 1975, Journal of the experimental analysis of behavior.
[42] David R. Shanks,et al. The Psychology of Associative Learning , 1995 .
[44] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[45] W M Baum,et al. On two types of deviation from the matching law: bias and undermatching. , 1974, Journal of the experimental analysis of behavior.