A neural model of hierarchical reinforcement learning
[1] Shie Mannor, et al. Dynamic abstraction in reinforcement learning via clustering, 2004, ICML.
[2] Chris Eliasmith. A Unified Approach to Building and Controlling Spiking Attractor Networks, 2005, Neural Computation.
[3] Charles L. Lawson, et al. Solving least squares problems, 1976, Classics in Applied Mathematics.
[4] Jonathan D. Cohen, et al. Learning to Use Working Memory in Partially Observable Environments through Dopaminergic Reinforcement, 2008, NIPS.
[5] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discrete Event Dynamic Systems.
[6] Y. Niv. Reinforcement learning in the brain, 2009.
[7] Carlos Diuk, et al. Hierarchical Learning Induces Two Simultaneous, But Separable, Prediction Errors in Human Basal Ganglia, 2013, The Journal of Neuroscience.
[8] P. Dayan, et al. Reinforcement learning: The Good, The Bad and The Ugly, 2008, Current Opinion in Neurobiology.
[9] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[10] C. Eliasmith, et al. Dynamic Behaviour of a Spiking Model of Action Selection in the Basal Ganglia Neural Structure, 2010.
[11] Samuel J. Gershman, et al. Computational rationality: A converging paradigm for intelligence in brains, minds, and machines, 2015, Science.
[12] M. D’Esposito, et al. Frontal Cortex and the Discovery of Abstract Action Rules, 2010, Neuron.
[13] Nicolas P. Rougier, et al. Learning representations in a gated prefrontal cortex model of dynamic task switching, 2002, Cognitive Science.
[14] W. Senn, et al. Reinforcement learning in populations of spiking neurons, 2008, Nature Neuroscience.
[15] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[16] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.
[17] P. Redgrave, et al. The basal ganglia: a vertebrate solution to the selection problem?, 1999, Neuroscience.
[18] Chris Eliasmith, et al. A Unified Approach to Building and Controlling Spiking Attractor Networks, 2005, Neural Computation.
[19] Terrence C. Stewart, et al. Python Scripting in the Nengo Simulator, 2009, Frontiers in Neuroinformatics.
[20] Mitsuo Kawato, et al. Heterarchical reinforcement-learning model for integration of multiple cortico-striatal loops: fMRI examination in stimulus-action-reward association learning, 2006, Neural Networks.
[21] Andrew G. Barto, et al. Reinforcement learning, 1998.
[22] Andrew G. Barto, et al. Using relative novelty to identify useful temporal abstractions in reinforcement learning, 2004, ICML.
[23] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, Journal of Machine Learning Research.
[24] Chris Eliasmith, et al. Neural Engineering: Computation, Representation, and Dynamics in Neurobiological Systems, 2004, IEEE Transactions on Neural Networks.
[25] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, Journal of Artificial Intelligence Research.
[26] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artificial Intelligence.
[27] J. Cohen, et al. Dopamine, cognitive control, and schizophrenia: the gating model, 1999, Progress in Brain Research.
[28] Daniel Rasmussen. Hierarchical reinforcement learning in a biologically plausible neural architecture, 2014.
[29] Andrew G. Barto, et al. Behavioral Hierarchy: Exploration and Representation, 2013, Computational and Robotic Models of the Hierarchical Organization of Behavior.
[30] William W. Lytton, et al. Reinforcement Learning of Two-Joint Virtual Arm Reaching in a Computer Model of Sensorimotor Cortex, 2013, Neural Computation.
[31] Clay B. Holroyd, et al. Motivation of extended behaviors by anterior cingulate cortex, 2012, Trends in Cognitive Sciences.
[32] Satinder P. Singh, et al. Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes, 1994, AAAI.
[33] David J. Foster, et al. A model of hippocampally dependent navigation, using the temporal difference learning rule, 2000, Hippocampus.
[34] W. Schultz. Predictive reward signal of dopamine neurons, 1998, Journal of Neurophysiology.
[35] James Kozloski, et al. Self-referential forces are sufficient to explain different dendritic morphologies, 2013, Frontiers in Neuroinformatics.
[36] Peter Stone, et al. The utility of temporal abstraction in reinforcement learning, 2008, AAMAS.
[37] Joseph T. McGuire, et al. A Neural Signature of Hierarchical Reinforcement Learning, 2011, Neuron.
[38] Lilianne R. Mujica-Parodi, et al. Ventral striatal and medial prefrontal BOLD activation is correlated with reward-related electrocortical activity: A combined ERP and fMRI study, 2011, NeuroImage.
[39] C. Eliasmith, et al. Learning to Select Actions with Spiking Neurons in the Basal Ganglia, 2012, Frontiers in Neuroscience.
[40] Barry D. Nichols. Reinforcement learning in continuous state- and action-space, 2014.
[41] D. Plaut, et al. Doing without schema hierarchies: a recurrent connectionist approach to normal and impaired routine sequential action, 2004, Psychological Review.
[42] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.
[43] P. Dayan, et al. Model-based influences on humans’ choices and striatal prediction errors, 2011, Neuron.
[44] Markus Diesmann, et al. A Spiking Neural Network Model of an Actor-Critic Learning Agent, 2009, Neural Computation.
[45] Razvan V. Florian, et al. Reinforcement Learning Through Modulation of Spike-Timing-Dependent Synaptic Plasticity, 2007, Neural Computation.
[46] Eytan Ruppin, et al. Actor-critic models of the basal ganglia: new anatomical and computational perspectives, 2002, Neural Networks.
[47] Wulfram Gerstner, et al. Reinforcement Learning Using a Continuous Time Actor-Critic Framework with Spiking Neurons, 2013, PLoS Computational Biology.
[48] Ari Weinstein, et al. Model-based hierarchical reinforcement learning and human action control, 2014, Philosophical Transactions of the Royal Society B: Biological Sciences.
[49] Ron Meir, et al. Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule, 2007, Neural Computation.
[50] Walter Senn, et al. Spatio-Temporal Credit Assignment in Neuronal Population Learning, 2011, PLoS Computational Biology.
[51] Chris Eliasmith, et al. A spiking neural model applied to the study of human performance and cognitive decline on Raven's Advanced Progressive Matrices, 2014.
[52] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[53] Chris Eliasmith, et al. A neural reinforcement learning model for tasks with unknown time delays, 2013, CogSci.
[54] Michael J. Frank, et al. Making Working Memory Work: A Computational Model of Learning in the Prefrontal Cortex and Basal Ganglia, 2006, Neural Computation.
[55] Joseph J. Paton, et al. A Scalable Population Code for Time in the Striatum, 2015, Current Biology.
[56] J. Hollerman, et al. Reward processing in primate orbitofrontal cortex and basal ganglia, 2000, Cerebral Cortex.
[57] Mahesan Niranjan, et al. On-line Q-learning using connectionist systems, 1994.
[58] Peter Redgrave, et al. A computational model of action selection in the basal ganglia. I. A new functional anatomy, 2001, Biological Cybernetics.
[59] Iris van Rooij, et al. Hierarchies in Action and Motor Control, 2012, Journal of Cognitive Neuroscience.
[60] Chris Eliasmith, et al. Fine-Tuning and the Stability of Recurrent Neural Networks, 2011, PLoS ONE.
[61] Matthew Botvinick, et al. Divide and Conquer: Hierarchical Reinforcement Learning and Task Decomposition in Humans, 2013, Computational and Robotic Models of the Hierarchical Organization of Behavior.
[62] C. Lawson, et al. Solving least squares problems, 1976, Classics in Applied Mathematics.
[63] Trevor Bekolay, et al. Nengo: a Python tool for building large-scale functional brain models, 2014, Frontiers in Neuroinformatics.
[64] Alec Solway, et al. Optimal Behavioral Hierarchy, 2014, PLoS Computational Biology.
[65] Bernhard Hengst. Hierarchical Approaches, 2012, Reinforcement Learning.
[66] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.
[67] M. Frank, et al. Mechanisms of hierarchical reinforcement learning in corticostriatal circuits 1: computational analysis, 2012, Cerebral Cortex.
[68] Shie Mannor, et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning, 2002, ECML.
[69] Anne G. E. Collins, et al. Cognitive control over learning: creating, clustering, and generalizing task-set structure, 2013, Psychological Review.
[70] Marco Mirolli, et al. Computational and Robotic Models of the Hierarchical Organization of Behavior, 2013, Springer Berlin Heidelberg.
[71] M. Botvinick, et al. Doing without schema hierarchies: A connectionist approach to routine sequential action and its pathology, 2000.
[72] Andrew G. Barto, et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density, 2001, ICML.
[73] Wulfram Gerstner, et al. Spike-Based Reinforcement Learning in Continuous State and Action Space: When Policy Gradient Methods Fail, 2009, PLoS Computational Biology.
[74] M. Botvinick, et al. Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective, 2009, Cognition.
[75] Samuel M. McClure, et al. Hierarchical control over effortful behavior by rodent medial frontal cortex: A computational model, 2015, Psychological Review.
[76] Ronald A. Howard, et al. Dynamic Probabilistic Systems, 1971.
[77] Robert C. Wilson, et al. Orbitofrontal Cortex as a Cognitive Map of Task Space, 2014, Neuron.
[78] G. Schoenbaum, et al. Neural Encoding in Orbitofrontal Cortex and Basolateral Amygdala during Olfactory Discrimination Learning, 1999, The Journal of Neuroscience.
[79] Thomas E. Hazy, et al. PVLV: the primary value and learned value Pavlovian learning algorithm, 2007, Behavioral Neuroscience.
[80] Markus Werning, et al. Compositionality and Biologically Plausible Models, 2009.
[81] H. Seung. Learning in Spiking Neural Networks by Reinforcement of Stochastic Synaptic Transmission, 2003, Neuron.
[82] E. Izhikevich. Solving the distal reward problem through linkage of STDP and dopamine signaling, 2007, BMC Neuroscience.