Deep Reinforcement Learning and Its Neuroscientific Implications
Matthew Botvinick | Jane X. Wang | Will Dabney | Kevin J. Miller | Zeb Kurth-Nelson
[1] Huihui Zhou,et al. Algorithmic Research on Exploring Neural Networks with Activation Atlases , 2022, Software Engineering and Applications.
[2] Marlos C. Machado,et al. Exploration in Reinforcement Learning with Deep Covering Options , 2020, ICLR.
[3] Adam Santoro,et al. Backpropagation and the brain , 2020, Nature Reviews Neuroscience.
[4] Richard Naud,et al. Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits , 2020, Nature Neuroscience.
[5] I. Momennejad. Learning Structures: Predictive Representations, Replay, and Generalization , 2020, Current Opinion in Behavioral Sciences.
[6] Daniel Guo,et al. Never Give Up: Learning Directed Exploration Strategies , 2020, ICLR.
[7] Demis Hassabis,et al. MEMO: A Deep Network for Flexible Combination of Episodic Memories , 2020, ICLR.
[8] A. Nieder,et al. Dopamine Gates Visual Signals in Monkey Prefrontal Cortex Neurons , 2020, Cell Reports.
[9] Zeb Kurth-Nelson,et al. A distributional code for value in dopamine-based reinforcement learning , 2020, Nature.
[10] Demis Hassabis,et al. Mastering Atari, Go, chess and shogi by planning with a learned model , 2019, Nature.
[11] Razvan Pascanu,et al. Stabilizing Transformers for Reinforcement Learning , 2019, ICML.
[12] Caswell Barry,et al. The Tolman-Eichenbaum Machine: Unifying Space and Relational Memory through Generalization in the Hippocampal Formation , 2019, Cell.
[13] Uri Hasson,et al. Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks , 2019, Neuron.
[14] David Warde-Farley,et al. Fast Task Inference with Variational Intrinsic Successor Features , 2019, ICLR.
[15] Doina Precup,et al. The Option Keyboard: Combining Skills in Reinforcement Learning , 2021, NeurIPS.
[16] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[17] Greg Wayne,et al. Hierarchical motor control in mammals and machines , 2019, Nature Communications.
[18] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[19] Marcin Andrychowicz,et al. Solving Rubik's Cube with a Robot Hand , 2019, ArXiv.
[20] Surya Ganguli,et al. A deep learning framework for neuroscience , 2019, Nature Neuroscience.
[21] Stephen Clark,et al. Emergent Systematic Generalization in a Situated Agent , 2019, ICLR.
[22] George Konidaris,et al. On the necessity of abstraction , 2019, Current Opinion in Behavioral Sciences.
[23] Anthony M. Zador,et al. A critique of pure learning and what artificial neural networks can learn from animal brains , 2019, Nature Communications.
[24] Alyssa A. Carey,et al. Reward revaluation biases hippocampal replay content away from the preferred outcome , 2019, Nature Neuroscience.
[25] Andrew R. Mitz,et al. Subcortical Substrates of Explore-Exploit Decisions in Primates , 2019, Neuron.
[26] Maneesh Sahani,et al. A neurally plausible model learns successor representations in partially observable environments , 2019, NeurIPS.
[27] Marc G. Bellemare,et al. DeepMDP: Learning Continuous Latent Space Models for Representation Learning , 2019, ICML.
[28] Alexander Lerchner,et al. COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven Exploration , 2019, ArXiv.
[29] Jane X. Wang,et al. Reinforcement Learning, Fast and Slow , 2019, Trends in Cognitive Sciences.
[30] Radoslaw Martin Cichy,et al. Deep Neural Networks as Scientific Models , 2019, Trends in Cognitive Sciences.
[31] C. Olah,et al. Activation Atlas , 2019, Distill.
[32] James C. R. Whittington,et al. Theories of Error Back-Propagation in the Brain , 2019, Trends in Cognitive Sciences.
[33] Doina Precup,et al. The Termination Critic , 2019, AISTATS.
[34] Marc G. Bellemare,et al. A Comparative Analysis of Expected and Distributional Reinforcement Learning , 2019, AAAI.
[35] Zeb Kurth-Nelson,et al. Causal Reasoning from Meta-reinforcement Learning , 2019, ArXiv.
[36] Tom Eccles,et al. An investigation of model-free planning , 2019, ICML.
[37] Marc G. Bellemare,et al. An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents , 2018, IJCAI.
[38] Taehoon Kim,et al. Quantifying Generalization in Reinforcement Learning , 2018, ICML.
[39] H. Francis Song,et al. Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning , 2018, ICML.
[40] Yan Wu,et al. Optimizing agent behavior over long time scales by transporting value , 2018, Nature Communications.
[41] Nicolas Heess,et al. Hierarchical visuomotor control of humanoids , 2018, ICLR.
[42] Amos J. Storkey,et al. Exploration by Random Network Distillation , 2018, ICLR.
[43] H. Francis Song,et al. Relational Forward Models for Multi-Agent Learning , 2018, ICLR.
[44] Guy Lever,et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[45] Demis Hassabis,et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play , 2018, Science.
[46] Yoshua Bengio,et al. Dendritic cortical microcircuits approximate the backpropagation algorithm , 2018, NeurIPS.
[47] Zeb Kurth-Nelson,et al. What Is a Cognitive Map? Organizing Knowledge for Flexible Behavior , 2018, Neuron.
[48] Razvan Pascanu,et al. Relational inductive biases, deep learning, and graph networks , 2018, ArXiv.
[49] Zeb Kurth-Nelson,et al. Been There, Done That: Meta-Learning with Episodic Recall , 2018, ICML.
[50] Razvan Pascanu,et al. Vector-based navigation using grid-like representations in artificial agents , 2018, Nature.
[51] Daniel L. K. Yamins,et al. A Task-Optimized Neural Network Replicates Human Auditory Behavior, Predicts Brain Responses, and Reveals a Cortical Processing Hierarchy , 2018, Neuron.
[52] Nathalie L Rochefort,et al. Action and learning shape the activity of neuronal circuits in the visual cortex , 2018, Current Opinion in Neurobiology.
[53] Samy Bengio,et al. A Study on Overfitting in Deep Reinforcement Learning , 2018, ArXiv.
[54] Satinder Singh,et al. On Learning Intrinsic Rewards for Policy Gradient Methods , 2018, NeurIPS.
[55] Joel Z. Leibo,et al. Prefrontal cortex as a meta-reinforcement learning system , 2018, bioRxiv.
[56] S. Gershman. Deconstructing the human algorithms for exploration , 2018, Cognition.
[57] Joel Z. Leibo,et al. Unsupervised Predictive Memory in a Goal-Directed Agent , 2018, ArXiv.
[58] Jürgen Schmidhuber,et al. World Models , 2018, ArXiv.
[59] H. Francis Song,et al. Machine Theory of Mind , 2018, ICML.
[60] Marco Baroni,et al. Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks , 2017, ICML.
[61] Doina Precup,et al. When Waiting is not an Option: Learning Options with a Deliberation Cost , 2017, AAAI.
[62] Sergey Levine,et al. Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[63] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[64] Marcelo G Mattar,et al. Prioritized memory access explains planning and hippocampal replay , 2017, Nature Neuroscience.
[65] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[66] N. Uchida,et al. Neural Circuitry of Reward Prediction Error , 2017, Annual Review of Neuroscience.
[67] Jonathan D. Cohen,et al. Toward a Rational and Mechanistic Account of Mental Effort , 2017, Annual Review of Neuroscience.
[68] Christopher Burgess,et al. DARLA: Improving Zero-Shot Transfer in Reinforcement Learning , 2017, ICML.
[69] D. Hassabis,et al. Neuroscience-Inspired Artificial Intelligence , 2017, Neuron.
[70] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[71] Yee Whye Teh,et al. Distral: Robust multitask reinforcement learning , 2017, NIPS.
[72] Ari Weinstein,et al. Structure Learning in Motor Control: A Deep Reinforcement Learning Model , 2017, CogSci.
[73] Chethan Pandarinath,et al. Inferring single-trial neural population dynamics using sequential auto-encoders , 2017, Nature Methods.
[74] Kimberly L. Stachenfeld,et al. The hippocampus as a predictive map , 2017, Nature Neuroscience.
[75] K. Norman,et al. Reinstated episodic context guides sampling-based decisions for reward , 2017, Nature Neuroscience.
[76] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[77] Razvan Pascanu,et al. Metacontrol for Adaptive Imagination-Based Optimization , 2017, ICLR.
[78] Tom Schaul,et al. FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.
[79] Marlos C. Machado,et al. A Laplacian Framework for Option Discovery in Reinforcement Learning , 2017, ICML.
[80] Joel Z. Leibo,et al. Multi-agent Reinforcement Learning in Sequential Social Dilemmas , 2017, AAMAS.
[81] N. Daw,et al. Reinforcement Learning and Episodic Memory in Humans and Animals: An Integrative Framework , 2017, Annual review of psychology.
[82] Razvan Pascanu,et al. Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.
[83] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[84] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[85] Misha Denil,et al. Learning to Perform Physics Experiments via Deep Reinforcement Learning , 2016, ICLR.
[86] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[87] Tom Schaul,et al. Successor Features for Transfer in Reinforcement Learning , 2016, NIPS.
[88] Marcel van Gerven,et al. Modeling the Dynamics of Human Brain Activity with Recurrent Neural Networks , 2016, Front. Comput. Neurosci.
[89] Sergio Gomez Colmenarejo,et al. Hybrid computing using a neural network with dynamic external memory , 2016, Nature.
[90] P. Dayan,et al. Adaptive integration of habits into depth-limited planning defines a habitual-goal–directed spectrum , 2016, Proceedings of the National Academy of Sciences.
[91] Yuval Tassa,et al. Learning and Transfer of Modulated Locomotor Controllers , 2016, ArXiv.
[92] Rafal Bogacz,et al. Learning Reward Uncertainty in the Basal Ganglia , 2016, PLoS Comput. Biol.
[93] Xiao-Jing Wang,et al. Reward-based training of recurrent neural networks for cognitive and value-based tasks , 2016, bioRxiv.
[94] James L. McClelland,et al. What Learning Systems do Intelligent Agents Need? Complementary Learning Systems Theory Updated , 2016, Trends in Cognitive Sciences.
[95] Timothy E. J. Behrens,et al. Organizing conceptual knowledge in humans with a gridlike code , 2016, Science.
[96] Joel Z. Leibo,et al. Model-Free Episodic Control , 2016, ArXiv.
[97] Konrad P. Körding,et al. Toward an Integration of Deep Learning and Neuroscience , 2016, bioRxiv.
[98] Samuel Gershman,et al. Deep Successor Reinforcement Learning , 2016, ArXiv.
[99] Joshua B. Tenenbaum,et al. Building machines that learn and think like people , 2016, Behavioral and Brain Sciences.
[100] Christopher D. Harvey,et al. Recurrent Network Models of Sequence Generation and Memory , 2016, Neuron.
[101] J. DiCarlo,et al. Using goal-driven deep learning models to understand sensory cortex , 2016, Nature Neuroscience.
[102] Benjamin Van Roy,et al. Deep Exploration via Bootstrapped DQN , 2016, NIPS.
[103] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[104] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[105] Nikolaus Kriegeskorte,et al. Deep neural networks: a new framework for modelling biological vision and brain information processing , 2015, bioRxiv.
[106] F. Cushman,et al. Habitual control of goal selection in humans , 2015, Proceedings of the National Academy of Sciences.
[107] Alec Solway,et al. Reinforcement learning, efficient coding, and the statistics of natural tasks , 2015, Current Opinion in Behavioral Sciences.
[108] Matthew T. Kaufman,et al. A neural network that finds a naturalistic solution for the production of muscle activity , 2015, Nature Neuroscience.
[109] G. Schoenbaum,et al. What the orbitofrontal cortex does not do , 2015, Nature Neuroscience.
[110] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[111] Christopher H Chatham,et al. Multiple gates on working memory , 2015, Current Opinion in Behavioral Sciences.
[112] Jonathan D. Cohen,et al. Humans use directed and random exploration to solve the explore-exploit dilemma , 2014, Journal of Experimental Psychology: General.
[113] Jonathan D. Cohen,et al. The Computational and Neural Basis of Cognitive Control: Charted Territory and New Frontiers , 2014, Cogn. Sci.
[114] Ha Hong,et al. Performance-optimized hierarchical models predict neural responses in higher visual cortex , 2014, Proceedings of the National Academy of Sciences.
[115] Robert C. Wilson,et al. Orbitofrontal Cortex as a Cognitive Map of Task Space , 2014, Neuron.
[116] Shinsuke Shimojo,et al. Neural Computations Underlying Arbitration between Model-Based and Model-free Learning , 2013, Neuron.
[117] N. Daw,et al. Multiple Systems for Value Learning , 2014.
[118] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[119] W. Newsome,et al. Context-dependent computation by recurrent dynamics in prefrontal cortex , 2013, Nature.
[120] P. Dayan,et al. Goals and Habits in the Brain , 2013, Neuron.
[121] Raymond J. Dolan,et al. Exploration, novelty, surprise, and free energy minimization , 2013, Front. Psychol.
[122] Brad E. Pfeiffer,et al. Hippocampal place cell sequences depict future paths to remembered goals , 2013, Nature.
[123] M. Botvinick,et al. Neural representations of events arise from temporal community structure , 2013, Nature Neuroscience.
[124] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res.
[125] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[126] D. Shohamy,et al. Preference by Association: How Memory Mechanisms in the Hippocampus Bias Decisions , 2012, Science.
[127] H. Seo,et al. Neural basis of reinforcement learning and decision making , 2012, Annual Review of Neuroscience.
[128] Anne G E Collins,et al. How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis , 2012, The European Journal of Neuroscience.
[129] George Konidaris,et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis , 2011, AAAI.
[130] P. Dayan,et al. Model-based influences on humans’ choices and striatal prediction errors , 2011, Neuron.
[131] P. Glimcher. Understanding dopamine and reinforcement learning: The dopamine reward prediction error hypothesis , 2011, Proceedings of the National Academy of Sciences.
[132] Simon Hong,et al. A pallidus-habenula-dopamine pathway signals inferred stimulus values , 2010, Journal of Neurophysiology.
[133] Lee Spector,et al. Genetic Programming for Reward Function Search , 2010, IEEE Transactions on Autonomous Mental Development.
[134] Richard L. Lewis,et al. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective , 2010, IEEE Transactions on Autonomous Mental Development.
[135] P. Dayan,et al. States versus Rewards: Dissociable Neural Prediction Error Signals Underlying Model-Based and Model-Free Reinforcement Learning , 2010, Neuron.
[136] Matthijs A. A. van der Meer,et al. Hippocampal Replay Is Not a Simple Function of Experience , 2010, Neuron.
[137] B. Balleine,et al. Human and Rodent Homologies in Action Control: Corticostriatal Determinants of Goal-Directed and Habitual Action , 2010, Neuropsychopharmacology.
[138] D. Blei,et al. Context, learning, and extinction , 2010, Psychological Review.
[139] M. Botvinick,et al. Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective , 2009, Cognition.
[140] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[141] Y. Niv. Reinforcement learning in the brain , 2009, Journal of Mathematical Psychology.
[142] Geoffrey E. Hinton,et al. Deep, Narrow Sigmoid Belief Networks Are Universal Approximators , 2008, Neural Computation.
[143] Andre Cohen,et al. An object-oriented representation for efficient reinforcement learning , 2008, ICML '08.
[144] David Badre,et al. Cognitive control, hierarchy, and the rostro–caudal organization of the frontal lobes , 2008, Trends in Cognitive Sciences.
[145] D. Heeger,et al. A Hierarchy of Temporal Receptive Windows in Human Cortex , 2008, The Journal of Neuroscience.
[146] Peter Dayan,et al. Hippocampal Contributions to Control: The Third Way , 2007, NIPS.
[147] Sridhar Mahadevan,et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes , 2007, J. Mach. Learn. Res.
[148] Pierre-Yves Oudeyer,et al. Intrinsic Motivation Systems for Autonomous Mental Development , 2007, IEEE Transactions on Evolutionary Computation.
[149] C. Padoa-Schioppa,et al. Neurons in the orbitofrontal cortex encode economic value , 2006, Nature.
[150] M. Frank,et al. Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal , 2006, Psychological Review.
[151] Michael J. Frank,et al. Making Working Memory Work: A Computational Model of Learning in the Prefrontal Cortex and Basal Ganglia , 2006, Neural Computation.
[152] P. Dayan,et al. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control , 2005, Nature Neuroscience.
[153] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, MIT Press.
[154] Nuttapong Chentanez,et al. Intrinsically Motivated Reinforcement Learning , 2004, NIPS.
[155] Kunihiko Fukushima,et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.
[156] P. Dayan,et al. Reward, Motivation, and Reinforcement Learning , 2002, Neuron.
[157] David J. Freedman,et al. Categorical representation of visual stimuli in the primate prefrontal cortex , 2001, Science.
[158] H. Stanley,et al. Optimizing the success of random searches , 1999, Nature.
[159] H. Eichenbaum,et al. The Hippocampus, Memory, and Place Cells: Is It Spatial Memory or a Memory Space? , 1999, Neuron.
[160] Rajesh P. N. Rao,et al. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects , 1999, Nature Neuroscience.
[161] Pieter R. Roelfsema,et al. Object-based attention in the primary visual cortex of the macaque monkey , 1998, Nature.
[162] B. Balleine,et al. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates , 1998, Neuropharmacology.
[163] David J. Field,et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images , 1996, Nature.
[164] B. McNaughton,et al. Reactivation of hippocampal ensemble memories during sleep , 1994, Science.
[165] Deborah Silver,et al. Feature Visualization , 1994, Scientific Visualization.
[166] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[167] Jürgen Schmidhuber,et al. Curious model-building control systems , 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.
[168] Long Ji Lin,et al. Programming Robots Using Reinforcement Learning and Teaching , 1991, AAAI.
[169] David Zipser,et al. Recurrent Network Model of the Neural Mechanism of Short-Term Active Memory , 1991, Neural Computation.
[170] Richard A. Andersen,et al. A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons , 1988, Nature.
[171] Teuvo Kohonen,et al. Self-Organization and Associative Memory , 1988 .
[172] David E. Rumelhart,et al. Learning internal representations by error propagation , 1986, Parallel Distributed Processing.
[173] P. Werbos. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences , 1974, Ph.D. thesis, Harvard University.
[174] D. Hubel,et al. Receptive fields of single neurones in the cat's striate cortex , 1959, The Journal of Physiology.
[175] D. O. Hebb. The Organization of Behavior: A Neuropsychological Theory , 1949, Wiley.