The Successor Representation as a model of behavioural flexibility