Reinforcement Learning and Episodic Memory in Humans and Animals: An Integrative Framework
[1] W. Brogden. Sensory pre-conditioning, 1939.
[2] E. Tolman. Cognitive maps in rats and men, 1948, Psychological review.
[3] A. Tversky, et al. Prospect theory: An analysis of decision under risk, 1979, Econometrica.
[4] R. Passingham. Book review: The hippocampus as a cognitive map, by J. O'Keefe & L. Nadel, Oxford University Press, Oxford (1978), 1979, Neuroscience.
[5] Christopher D. Adams. Variations in the Sensitivity of Instrumental Responding to Reinforcer Devaluation , 1982 .
[6] John G. Lynch,et al. Memory and Attentional Factors in Consumer Choice: Concepts and Research Methods , 1982 .
[7] R. Nosofsky. Attention, similarity, and the identification-categorization relationship. , 1986, Journal of experimental psychology. General.
[8] Dale T. Miller,et al. Norm theory: Comparing reality to its alternatives , 1986 .
[9] Christopher K. Riesbeck,et al. Inside Case-Based Reasoning , 1989 .
[10] G. E. Alexander,et al. Functional architecture of basal ganglia circuits: neural substrates of parallel processing , 1990, Trends in Neurosciences.
[11] P. Nedungadi. Recall and Consumer Consideration Sets: Influencing Choice without Altering Brand Evaluations , 1990 .
[12] Richard S. Sutton, et al. Dyna, an integrated architecture for learning, planning, and reacting, 1990, SIGART Bulletin.
[13] L. Squire. Memory and the hippocampus: a synthesis from findings with rats, monkeys, and humans. , 1992, Psychological review.
[14] J. Kruschke,et al. ALCOVE: an exemplar-based connectionist model of category learning. , 1992, Psychological review.
[15] Elie Bienenstock,et al. Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.
[16] A. Tversky,et al. Choice in Context: Tradeoff Contrast and Extremeness Aversion , 1992 .
[17] Peter Dayan,et al. Improving Generalization for Temporal Difference Learning: The Successor Representation , 1993, Neural Computation.
[18] Joel L. Davis,et al. A Model of How the Basal Ganglia Generate and Use Neural Signals That Predict Reinforcement , 1994 .
[19] P. Dayan,et al. A framework for mesencephalic dopamine systems based on predictive Hebbian learning , 1996, The Journal of neuroscience : the official journal of the Society for Neuroscience.
[20] Jennifer A. Mangels,et al. A Neostriatal Habit Learning System in Humans , 1996, Science.
[21] J. D. McGaugh,et al. Inactivation of Hippocampus or Caudate Nucleus with Lidocaine Differentially Affects Expression of Place and Response Learning , 1996, Neurobiology of Learning and Memory.
[22] B. McNaughton,et al. Replay of Neuronal Firing Sequences in Rat Hippocampus During Sleep Following Spatial Experience , 1996, Science.
[23] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.
[24] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[25] J. Gabrieli. Cognitive neuroscience of human memory. , 1998, Annual review of psychology.
[26] J. Tenenbaum,et al. A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.
[27] David J. Foster,et al. A model of hippocampally dependent navigation, using the temporal difference learning rule , 2000, Hippocampus.
[28] Michael Kearns,et al. Bias-Variance Error Bounds for Temporal Difference Updates , 2000, COLT.
[29] Itzhak Gilboa,et al. A theory of case-based decisions , 2001 .
[30] N. Cohen. From Conditioning to Conscious Recollection: Memory Systems of the Brain. Oxford Psychology Series, Volume 35, 2001.
[31] M. Gluck,et al. Interactive memory systems in the human brain , 2001, Nature.
[32] Jonathan D. Cohen,et al. Computational perspectives on dopamine function in prefrontal cortex , 2002, Current Opinion in Neurobiology.
[33] B. Balleine,et al. The Role of Learning in the Operation of Motivational Systems , 2002 .
[34] Cleotilde Gonzalez,et al. Instance-based learning in dynamic decision making , 2003, Cogn. Sci..
[35] B. Schölkopf, A. J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, 2002, MIT Press.
[36] I. Erev,et al. Small feedback‐based decisions and their limited correspondence to description‐based decisions , 2003 .
[37] Thomas Gärtner,et al. Kernels and Distances for Structured Data , 2004, Machine Learning.
[38] Michael J. Frank,et al. By Carrot or by Stick: Cognitive Reinforcement Learning in Parkinsonism , 2004, Science.
[39] A. Redish,et al. Addiction as a Computational Process Gone Awry , 2004, Science.
[40] D. Medin,et al. SUSTAIN: a network model of category learning. , 2004, Psychological review.
[41] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[42] Gordon D. A. Brown,et al. Absolute identification by relative judgment. , 2005, Psychological review.
[43] M. Gluck,et al. The role of dopamine in cognitive sequence learning: evidence from Parkinson’s disease , 2005, Behavioural Brain Research.
[44] T. Robbins,et al. Neural systems of reinforcement for drug addiction: from actions to habits to compulsion , 2005, Nature Neuroscience.
[45] P. Dayan,et al. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control , 2005, Nature Neuroscience.
[46] Richard S. Sutton, A. G. Barto. Reinforcement Learning: An Introduction, 1998, MIT Press.
[47] P. Glimcher,et al. Midbrain Dopamine Neurons Encode a Quantitative Reward Prediction Error Signal , 2005, Neuron.
[48] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[49] Shie Mannor,et al. Reinforcement learning with Gaussian processes , 2005, ICML.
[50] P. Glimcher, et al. Dynamic response-by-response models of matching behavior in rhesus monkeys, 2005, Journal of the Experimental Analysis of Behavior, 84, 555–579.
[51] George Loewenstein,et al. Mistake #37: The Effect of Previously Encountered Prices on Current Housing Demand , 2006 .
[52] Liming Xiang,et al. Kernel-Based Reinforcement Learning , 2006, ICIC.
[53] Michael J. Frank,et al. Making Working Memory Work: A Computational Model of Learning in the Prefrontal Cortex and Basal Ganglia , 2006, Neural Computation.
[54] R. Dolan,et al. Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans , 2006, Nature.
[55] Gordon D. A. Brown,et al. Decision by sampling , 2006, Cognitive Psychology.
[56] David S. Touretzky,et al. Representation and Timing in Theories of the Dopamine System , 2006, Neural Computation.
[58] Adam Johnson,et al. Neural Ensembles in CA3 Transiently Encode Paths Forward of the Animal at a Decision Point , 2007, The Journal of Neuroscience.
[59] Peter Dayan,et al. Hippocampal Contributions to Control: The Third Way , 2007, NIPS.
[60] Jonathan D. Cohen,et al. On the Control of Control: The Role of Dopamine in Regulating Prefrontal Function and Working Memory , 2007 .
[61] Sridhar Mahadevan,et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes , 2007, J. Mach. Learn. Res..
[62] Timothy J. Pleskac,et al. The Description-Experience Gap in Risky Choice: The Role of Sample Size and Experienced Probabilities , 2008 .
[64] D. Shohamy,et al. Integrating Memories in the Human Brain: Hippocampal-Midbrain Encoding of Overlapping Events , 2008, Neuron.
[65] Richard S. Sutton,et al. Stimulus Representation and the Timing of Reward-Prediction Errors in Models of the Dopamine System , 2008, Neural Computation.
[66] Colin Camerer,et al. Dissociating the Role of the Orbitofrontal Cortex and the Striatum in the Computation of Goal Values and Prediction Errors , 2008, The Journal of Neuroscience.
[67] Jonathan D. Cohen,et al. Learning to Use Working Memory in Partially Observable Environments through Dopaminergic Reinforcement , 2008, NIPS.
[68] E. Yechiam,et al. Loss aversion, diminishing sensitivity, and the effect of experience on repeated decisions† , 2008 .
[69] John R. Anderson,et al. Solving the credit assignment problem: explicit and implicit learning of action sequences with probabilistic outcomes , 2008, Psychological research.
[70] Eric A. Zilli,et al. Modeling the role of working memory and episodic memory in behavioral tasks , 2008, Hippocampus.
[71] M. Botvinick,et al. Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective , 2009, Cognition.
[72] B. McNaughton,et al. Hippocampus Leads Ventral Striatum in Replay of Place-Reward Information , 2009, PLoS biology.
[73] Y. Niv. Reinforcement learning in the brain , 2009 .
[74] R. Hertwig,et al. The description–experience gap in risky choice , 2009, Trends in Cognitive Sciences.
[75] Demis Hassabis,et al. The construction system of the brain , 2009, Philosophical Transactions of the Royal Society B: Biological Sciences.
[76] B. Schölkopf,et al. Does Cognitive Science Need Kernels? , 2009, Trends in Cognitive Sciences.
[77] I. Erev,et al. Learning, risk attitude and hot stoves in restless bandit problems , 2009 .
[78] D. Blei,et al. Context, learning, and extinction. , 2010, Psychological review.
[79] Rajesh P. N. Rao. Decision Making Under Uncertainty: A Neural Model Based on Partially Observable Markov Decision Processes , 2010, Front. Comput. Neurosci..
[80] P. Dayan,et al. States versus Rewards: Dissociable Neural Prediction Error Signals Underlying Model-Based and Model-Free Reinforcement Learning , 2010, Neuron.
[81] Nathaniel D. Daw,et al. Selective impairment of prediction error signaling in human dorsolateral but not ventral striatum in Parkinson's disease patients: evidence from a model-based fMRI study , 2010, NeuroImage.
[82] J. Tenenbaum,et al. Probabilistic models of cognition: exploring representations and inductive biases , 2010, Trends in Cognitive Sciences.
[83] Varun Dutt,et al. Instance-based learning: integrating sampling and repeated decisions from experience. , 2011, Psychological review.
[84] Matthijs A. A. van der Meer,et al. Theta Phase Precession in Rat Ventral Striatum Links Place and Reward Information , 2011, The Journal of Neuroscience.
[85] Nathaniel D. Daw,et al. Grid Cells, Place Cells, and Geodesic Generalization for Spatial Reinforcement Learning , 2011, PLoS Comput. Biol..
[86] Margaret F. Carr,et al. Hippocampal replay in the awake state: a potential substrate for memory consolidation and retrieval , 2011, Nature Neuroscience.
[87] P. Dayan,et al. Model-based influences on humans’ choices and striatal prediction errors , 2011, Neuron.
[88] Amir Dezfouli,et al. Speed/Accuracy Trade-Off between the Habitual and the Goal-Directed Processes , 2011, PLoS Comput. Biol..
[89] T. Robbins,et al. The hippocampal–striatal axis in learning, prediction and goal-directed behavior , 2011, Trends in Neurosciences.
[90] Katherine R. Sherrill,et al. The hippocampus is functionally connected to the striatum and orbitofrontal cortex during context dependent decision making , 2011, Brain Research.
[91] L. Davachi,et al. What Constitutes an Episode in Episodic Memory? , 2011, Psychological science.
[92] Anne E Carpenter,et al. Neuron-type specific signals for reward and punishment in the ventral tegmental area , 2011, Nature.
[93] P. Dayan,et al. Serotonin Selectively Modulates Reward Value in Human Decision-Making , 2012, The Journal of Neuroscience.
[94] R. N. Spreng,et al. The Future of Memory: Remembering, Imagining, and the Brain , 2012, Neuron.
[95] D. Shohamy,et al. Preference by Association: How Memory Mechanisms in the Hippocampus Bias Decisions , 2012, Science.
[96] Chantal E. Stern,et al. Cooperative interactions between hippocampal and striatal systems support flexible navigation , 2012, NeuroImage.
[97] Anne G E Collins,et al. How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis , 2012, The European journal of neuroscience.
[98] Brad E. Pfeiffer,et al. Hippocampal place cell sequences depict future paths to remembered goals , 2013, Nature.
[99] Bernard W. Balleine,et al. Actions, Action Sequences and Habits: Evidence That Goal-Directed and Habitual Action Control Are Hierarchically Organized , 2013, PLoS Comput. Biol..
[100] N. Daw,et al. Chapter 15 – Value Learning through Reinforcement: The Basics of Dopamine and Reinforcement Learning , 2013 .
[101] Alice Y. Chiang,et al. Working-memory capacity protects model-based learning from stress , 2013, Proceedings of the National Academy of Sciences.
[102] Carlos Diuk,et al. Hierarchical Learning Induces Two Simultaneous, But Separable, Prediction Errors in Human Basal Ganglia , 2013, The Journal of Neuroscience.
[103] A. Markman,et al. The Curse of Planning: Dissecting Multiple Reinforcement-Learning Systems by Taxing the Central Executive , 2013 .
[104] P. Dayan,et al. Goals and Habits in the Brain , 2013, Neuron.
[105] Josiah R. Boivin,et al. A Causal Link Between Prediction Errors, Dopamine Neurons and Learning , 2013, Nature Neuroscience.
[106] P. Glimcher,et al. Phasic Dopamine Release in the Rat Nucleus Accumbens Symmetrically Encodes a Reward Prediction Error Term , 2014, The Journal of Neuroscience.
[107] Erin Kendall Braun,et al. Episodic Memory Encoding Interferes with Reward Learning and Decreases Striatal Prediction Errors , 2014, The Journal of Neuroscience.
[108] Samuel Gershman,et al. Design Principles of the Hippocampal Cognitive Map , 2014, NIPS.
[109] Marcia L. Spetch,et al. Remembering the best and worst of times: Memories for extreme outcomes bias risky decisions , 2013, Psychonomic Bulletin & Review.
[110] Anne G E Collins,et al. Working Memory Contributions to Reinforcement Learning Impairments in Schizophrenia , 2014, The Journal of Neuroscience.
[111] P. Dayan,et al. The algorithmic anatomy of model-based evaluation , 2014, Philosophical Transactions of the Royal Society B: Biological Sciences.
[112] Thomas L. Griffiths,et al. The high availability of extreme events serves resource-rational decision-making , 2014, CogSci.
[113] A. Markman, et al. Retrospective revaluation in sequential decision making: A tale of two systems, 2012, Journal of Experimental Psychology: General.
[114] Shinsuke Shimojo,et al. Neural Computations Underlying Arbitration between Model-Based and Model-free Learning , 2013, Neuron.
[115] N. Daw,et al. Multiple Systems for Value Learning , 2014 .
[116] N. Daw. Advanced Reinforcement Learning , 2014 .
[117] Dylan A. Simon,et al. Model-based choices involve prospective neural activity , 2015, Nature Neuroscience.
[118] N. Daw,et al. Integrating memories to guide decisions , 2015, Current Opinion in Behavioral Sciences.
[119] Peter Dayan,et al. Interplay of approximate planning strategies , 2015, Proceedings of the National Academy of Sciences.
[120] P. Dayan,et al. Temporal structure in associative retrieval , 2015, eLife.
[121] Samuel Gershman,et al. Novelty and Inductive Generalization in Human Reinforcement Learning , 2015, Top. Cogn. Sci..
[122] M. Botvinick,et al. Evidence integration in model-based tree search , 2015, Proceedings of the National Academy of Sciences.
[123] Y. Niv,et al. Discovering latent causes in reinforcement learning , 2015, Current Opinion in Behavioral Sciences.
[124] F. Cushman,et al. Habitual control of goal selection in humans , 2015, Proceedings of the National Academy of Sciences.
[125] Robert C. Wilson,et al. Reinforcement Learning in Multidimensional Environments Relies on Attention Mechanisms , 2015, The Journal of Neuroscience.
[126] Lesley K Fellows,et al. Ventromedial Frontal Cortex Is Critical for Guiding Attention to Reward-Predictive Visual Features in Humans , 2015, The Journal of Neuroscience.
[127] Marcia L Spetch,et al. Priming memories of past wins induces risk seeking. , 2015, Journal of experimental psychology. General.
[128] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[129] Ilana B. Witten,et al. Reward and choice encoding in terminals of midbrain dopamine neurons depends on striatal target , 2016, Nature Neuroscience.
[130] Geoffrey Schoenbaum,et al. Midbrain dopamine neurons compute inferred and cached value prediction errors in a common framework , 2016, eLife.
[131] Lindsay E. Hunter,et al. Episodic memories predict adaptive value-based decision-making. , 2016, Journal of experimental psychology. General.
[132] N. Daw,et al. Characterizing a psychiatric symptom dimension related to deficits in goal-directed control , 2016, eLife.
[133] N. Daw,et al. What’s past is present: Reminders of past choices bias decisions for reward in humans , 2017, bioRxiv.