Decision theory, reinforcement learning, and the brain
[1] D. M. Green,et al. Signal detection theory and psychophysics , 1966 .
[2] J. Andel. Sequential Analysis , 2022, The SAGE Encyclopedia of Research Design.
[3] R. Rescorla,et al. A theory of Pavlovian conditioning : Variations in the effectiveness of reinforcement and nonreinforcement , 1972 .
[4] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm , 1977 .
[5] P. Taylor,et al. Test of optimal sampling by foraging great tits , 1978 .
[6] A Houston,et al. The application of statistical decision theory to animal behaviour. , 1980, Journal of theoretical biology.
[7] James O. Berger. Statistical Decision Theory , 1980 .
[8] Sheldon M. Ross,et al. Introduction to Stochastic Dynamic Programming: Probability and Mathematical , 1983 .
[9] G. Pyke. Optimal Foraging Theory: A Critical Review , 1984 .
[10] Donald A. Berry,et al. Bandit Problems: Sequential Allocation of Experiments. , 1986 .
[11] P. W. Jones,et al. Bandit Problems, Sequential Allocation of Experiments , 1987 .
[12] J. Berger. Statistical Decision Theory and Bayesian Analysis , 1988 .
[13] C. Clark,et al. Dynamic Modeling in Behavioral Ecology , 2019 .
[14] C. Watkins. Learning from delayed rewards , 1989 .
[15] J. Bather,et al. Multi‐Armed Bandit Allocation Indices , 1990 .
[16] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[17] J. Movshon,et al. The analysis of visual motion: a comparison of neuronal and psychophysical performance , 1992, The Journal of neuroscience : the official journal of the Society for Neuroscience.
[18] Lonnie Chrisman,et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.
[19] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .
[20] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[21] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[22] Karl J. Friston,et al. Value-dependent selection in the brain: Simulation in a synthetic neural model , 1994, Neuroscience.
[23] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst.
[24] A. Barto,et al. Adaptive Critics and the Basal Ganglia , 1994 .
[25] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[26] Joel L. Davis,et al. Adaptive Critics and the Basal Ganglia , 1995 .
[27] A. Yuille,et al. Bayesian decision theory and psychophysics , 1996 .
[28] P. Dayan,et al. A framework for mesencephalic dopamine systems based on predictive Hebbian learning , 1996, The Journal of neuroscience : the official journal of the Society for Neuroscience.
[29] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[30] J. Movshon,et al. A computational analysis of the relationship between neuronal and behavioral responses to visual motion , 1996, The Journal of neuroscience : the official journal of the Society for Neuroscience.
[31] M N Shadlen,et al. Motion perception: seeing and deciding. , 1996, Proceedings of the National Academy of Sciences of the United States of America.
[32] K. H. Britten,et al. A relationship between behavioral choice and the visual responses of neurons in macaque MT , 1996, Visual Neuroscience.
[33] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.
[34] W. Schultz,et al. Learning of sequential movements by neural network model with dopamine-like reinforcement signal , 1998, Experimental Brain Research.
[35] A. Parker,et al. Sense and the single neuron: probing the physiology of perception. , 1998, Annual review of neuroscience.
[36] Geoffrey E. Hinton,et al. A View of the Em Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.
[37] Jeffrey N. Rouder,et al. Modeling Response Times for Two-Choice Decisions , 1998 .
[38] Stuart J. Russell,et al. Bayesian Q-Learning , 1998, AAAI/IAAI.
[39] Alexandre Pouget,et al. Probabilistic Interpretation of Population Codes , 1996, Neural Computation.
[40] Michael L. Platt,et al. Neural correlates of decision variables in parietal cortex , 1999, Nature.
[41] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[42] R. Jacobs,et al. Optimal integration of texture and motion cues to depth , 1999, Vision Research.
[43] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res.
[44] James L. McClelland,et al. The time course of perceptual choice: the leaky, competing accumulator model. , 2001, Psychological review.
[45] Peter Dayan,et al. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems , 2001 .
[46] J. Gold,et al. Neural computations that underlie decisions about sensory stimuli , 2001, Trends in Cognitive Sciences.
[47] M. Ernst,et al. Humans integrate visual and haptic information in a statistically optimal fashion , 2002, Nature.
[48] W. Schultz. Getting Formal with Dopamine and Reward , 2002, Neuron.
[49] J. Gold,et al. Banburismus and the Brain Decoding the Relationship between Sensory Stimuli, Decisions, and Reward , 2002, Neuron.
[50] Xiao-Jing Wang,et al. Probabilistic Decision Making by Slow Reverberation in Cortical Circuits , 2002, Neuron.
[51] Kenji Doya,et al. Metalearning and neuromodulation , 2002, Neural Networks.
[52] Eytan Ruppin,et al. Actor-critic models of the basal ganglia: new anatomical and computational perspectives , 2002, Neural Networks.
[53] M. El-Sabaawi. Breakdown of Will , 2002 .
[54] Thomas G. Dietterich,et al., editors. Advances in Neural Information Processing Systems , 2002 .
[55] M. Shadlen,et al. Response of Neurons in the Lateral Intraparietal Area during a Combined Visual Discrimination Reaction Time Task , 2002, The Journal of Neuroscience.
[56] Peter Dayan,et al. Dopamine: generalization and bonuses , 2002, Neural Networks.
[57] Robert A Jacobs,et al. Bayesian integration of visual and auditory signals for spatial localization. , 2003, Journal of the Optical Society of America. A, Optics, image science, and vision.
[58] P. Glimcher. Decisions, Uncertainty, and the Brain: The Science of Neuroeconomics , 2003 .
[59] Karl J. Friston,et al. Temporal Difference Models and Reward-Related Learning in the Human Brain , 2003, Neuron.
[60] Michael S Landy,et al. Statistical decision theory and the selection of rapid, goal-directed movements. , 2003, Journal of the Optical Society of America. A, Optics, image science, and vision.
[61] M. Landy,et al. Statistical decision theory and trade-offs in the control of motor response. , 2003, Spatial vision.
[62] A. Bechara. Decisions, Uncertainty, and the Brain: The Science of Neuroeconomics , 2003 .
[63] S. Killcross,et al. Coordination of actions and habits in the medial prefrontal cortex of rats. , 2003, Cerebral cortex.
[64] Peter Dayan,et al. Doubly Distributional Population Codes: Simultaneous Representation of Uncertainty and Multiplicity , 2003, Neural Computation.
[65] Terrence J. Sejnowski,et al. Exploration Bonuses and Dual Control , 1996, Machine Learning.
[66] Karl J. Friston,et al. Dissociable Roles of Ventral and Dorsal Striatum in Instrumental Conditioning , 2004, Science.
[67] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[68] Philip L. Smith,et al. Psychology and neurobiology of simple decisions , 2004, Trends in Neurosciences.
[69] Samuel M. McClure,et al. Separate Neural Systems Value Immediate and Delayed Monetary Rewards , 2004, Science.
[70] Rajesh P. N. Rao. Bayesian Computation in Recurrent Neural Circuits , 2004, Neural Computation.
[71] K. Doya,et al. A Neural Correlate of Reward-Based Behavioral Learning in Caudate Nucleus: A Functional Magnetic Resonance Imaging Study of a Stochastic Decision Task , 2004, The Journal of Neuroscience.
[72] Peter Dayan,et al. Temporal difference models describe higher-order learning in humans , 2004, Nature.
[73] Konrad Paul Kording,et al. Bayesian integration in sensorimotor learning , 2004, Nature.
[74] Jonathan D. Cohen,et al. An exploration-exploitation model based on norepinepherine and dopamine activity , 2005, NIPS.
[75] T. Robbins,et al. Neural systems of reinforcement for drug addiction: from actions to habits to compulsion , 2005, Nature Neuroscience.
[76] P. Dayan,et al. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control , 2005, Nature Neuroscience.
[77] J. Wickens,et al. Striatal dopamine in motor activation and reward-mediated learning: steps towards a unifying model , 2005, Journal of Neural Transmission / General Section JNT.
[78] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[79] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[80] S. Hyman,et al. Neural mechanisms of addiction: the role of reward-related learning and memory. , 2006, Annual review of neuroscience.
[81] K. Berridge. The debate over dopamine’s role in reward: the case for incentive salience , 2007, Psychopharmacology.
[82] Anthony J. Movshon,et al. Optimal representation of sensory information by neural populations , 2006, Nature Neuroscience.
[83] Wei Ji Ma,et al. Bayesian inference with probabilistic population codes , 2006, Nature Neuroscience.
[84] Eero P. Simoncelli,et al. Noise characteristics and prior expectations in human visual speed perception , 2006, Nature Neuroscience.
[85] P. Dayan,et al. Cortical substrates for exploratory decisions in humans , 2006, Nature.
[86] X-J Wang,et al. Toward a Prefrontal Microcircuit Model for Cognitive Deficits in Schizophrenia , 2006, Pharmacopsychiatry.
[87] Angela J. Yu. Optimal Change-Detection and Spiking Neurons , 2006, NIPS.
[88] M. Landy,et al. Humans Rapidly Estimate Expected Gain in Movement Planning , 2006, Psychological science.
[89] Richard E. Turner,et al. Probabilistic Population Codes , 2006 .
[90] Jonathan D. Cohen,et al. The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. , 2006, Psychological review.
[91] S. Ishii,et al. Resolution of Uncertainty in Prefrontal Cortex , 2006, Neuron.
[92] David S. Touretzky,et al. Representation and Timing in Theories of the Dopamine System , 2006, Neural Computation.
[93] K. Doya,et al. The computational neurobiology of learning and reward , 2006, Current Opinion in Neurobiology.
[94] Xiao-Jing Wang,et al. Cortico–basal ganglia circuit mechanism for a decision threshold in reaction time tasks , 2006, Nature Neuroscience.
[95] B. Balleine,et al. The Role of the Dorsal Striatum in Reward and Decision-Making , 2007, The Journal of Neuroscience.
[96] Alexandre Pouget,et al. Exact Inferences in a Neural Implementation of a Hidden Markov Model , 2007, Neural Computation.
[97] Michael N. Shadlen,et al. Probabilistic reasoning by neurons , 2007, Nature.
[98] Angela J. Yu,et al. Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration , 2007, Philosophical Transactions of the Royal Society B: Biological Sciences.
[99] Michael N. Shadlen,et al. The Speed and Accuracy of a Simple Perceptual Decision: A Mathematical Primer. , 2007 .
[100] J. Gold,et al. The neural basis of decision making. , 2007, Annual review of neuroscience.
[101] A. Pouget,et al. Probabilistic population codes and the exponential family of distributions. , 2007, Progress in brain research.
[102] J. Horvitz,et al. Dopaminergic Mechanisms in Actions and Habits , 2007, The Journal of Neuroscience.
[103] R. Costa. Plastic Corticostriatal Circuits for Action Learning , 2007, Annals of the New York Academy of Sciences.
[104] R. Wightman,et al. Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens , 2007, Nature Neuroscience.
[105] Peter Dayan,et al. Hippocampal Contributions to Control: The Third Way , 2007, NIPS.
[106] P. Glimcher,et al. The neural correlates of subjective value during intertemporal choice , 2007, Nature Neuroscience.
[107] Konrad Paul Kording,et al. Decision Theory: What "Should" the Nervous System Do? , 2007, Science.
[108] M. Roesch,et al. Should I Stay or Should I Go? , 2007 .
[109] Roger Ratcliff,et al. The Diffusion Decision Model: Theory and Data for Two-Choice Decision Tasks , 2008, Neural Computation.
[110] P. Redgrave,et al. What is reinforced by phasic dopamine signals? , 2008, Brain Research Reviews.
[111] Sophie Denève,et al. Bayesian Spiking Neurons I: Inference , 2008, Neural Computation.
[112] M. Sahani,et al. Implicit knowledge of visual uncertainty guides decisions with asymmetric outcomes. , 2008, Journal of vision.
[113] N. Daw,et al. Striatal Activity Underlies Novelty-Based Choice in Humans , 2008, Neuron.
[114] J. Crotts. Why Choose this Book? How we make decisions , 2008 .