Reward-based training of recurrent neural networks for cognitive and value-based tasks
Xiao-Jing Wang | H. Francis Song | Guangyu R. Yang
[1] Xiao-Jing Wang,et al. Synaptic computation underlying probabilistic inference , 2010, Nature Neuroscience.
[2] Xiao-Jing Wang,et al. The importance of mixed selectivity in complex cognitive tasks , 2013, Nature.
[3] Y. Niv,et al. Silencing the Critics: Understanding the Effects of Cocaine Sensitization on Dorsolateral and Ventral Striatum in the Context of an Actor/Critic Model , 2008, Front. Neurosci..
[4] K. Koketsu,et al. Cholinergic and inhibitory synapses in a pathway from motor‐axon collaterals to motoneurones , 1954, The Journal of physiology.
[5] N. Parga,et al. Dynamic Control of Response Criterion in Premotor Cortex during Perceptual Detection under Temporal Uncertainty , 2015, Neuron.
[6] Daniel L. K. Yamins,et al. Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition , 2014, PLoS Comput. Biol..
[7] Jonathan Baxter,et al. Learning internal representations , 1995, COLT '95.
[8] P. Glimcher,et al. Midbrain Dopamine Neurons Encode a Quantitative Reward Prediction Error Signal , 2005, Neuron.
[9] Pieter R. Roelfsema,et al. Reinforcement Learning of Linking and Tracing Contours in Recurrent Neural Networks , 2015, PLoS Comput. Biol..
[10] Jonathan D. Cohen,et al. Learning to Use Working Memory in Partially Observable Environments through Dopaminergic Reinforcement , 2008, NIPS.
[11] Xiao-Jing Wang,et al. A Recurrent Network Mechanism of Time Integration in Perceptual Decisions , 2006, The Journal of Neuroscience.
[12] M. Shadlen,et al. Response of Neurons in the Lateral Intraparietal Area during a Combined Visual Discrimination Reaction Time Task , 2002, The Journal of Neuroscience.
[13] W. Gerstner,et al. Optimal Control of Transient Dynamics in Balanced Networks Supports Generation of Complex Movements , 2014, Neuron.
[14] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[15] Joel L. Davis,et al. A Model of How the Basal Ganglia Generate and Use Neural Signals That Predict Reinforcement , 1994 .
[16] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[17] Yael Niv,et al. Reinforcement learning with Marr , 2016, Current Opinion in Behavioral Sciences.
[18] Wolfgang Maass,et al. Emergence of complex computational structures from chaotic neural networks through reward-modulated Hebbian learning. , 2014, Cerebral cortex.
[19] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[20] N. Daw,et al. Signals in Human Striatum Are Appropriate for Policy Update Rather than Value Prediction , 2011, The Journal of Neuroscience.
[21] R. Romo,et al. Neuronal correlates of parametric working memory in the prefrontal cortex , 1999, Nature.
[22] Azad Adam,et al. The Functional Requirements , 2007 .
[23] Xiao-Jing Wang,et al. Probabilistic Decision Making by Slow Reverberation in Cortical Circuits , 2002, Neuron.
[24] Francesca Mastrogiuseppe,et al. Intrinsically-generated fluctuating activity in excitatory-inhibitory networks , 2016, PLoS Comput. Biol..
[25] David Sussillo,et al. Opening the Black Box: Low-Dimensional Dynamics in High-Dimensional Recurrent Neural Networks , 2013, Neural Computation.
[26] Henning Sprekeler,et al. Functional Requirements for Reward-Modulated Spike-Timing-Dependent Plasticity , 2010, The Journal of Neuroscience.
[27] G. Schoenbaum,et al. What the orbitofrontal cortex does not do , 2015, Nature Neuroscience.
[28] O. Hikosaka,et al. Basal ganglia circuits for reward value-guided behavior. , 2014, Annual review of neuroscience.
[29] J. Wallis. Orbitofrontal cortex and its contribution to decision-making. , 2007, Annual review of neuroscience.
[30] Richard A. Andersen,et al. A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons , 1988, Nature.
[31] Robert Babuska,et al. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[32] Jessica Lowell. Neural Network , 2001 .
[33] Patrice Simardy,et al. Learning Long-Term Dependencies with , 2007 .
[34] Xiao-Jing Wang,et al. Internal Representation of Task Rules by Recurrent Dynamics: The Importance of the Diversity of Neural Responses , 2010, Front. Comput. Neurosci..
[35] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.
[36] Surya Ganguli,et al. On simplicity and complexity in the brave new world of large-scale neuroscience , 2015, Current Opinion in Neurobiology.
[37] Joel L. Davis,et al. Adaptive Critics and the Basal Ganglia , 1995 .
[38] M. Frank,et al. Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. , 2006, Psychological review.
[39] C. Padoa-Schioppa,et al. Neurons in the orbitofrontal cortex encode economic value , 2006, Nature.
[40] David Sussillo,et al. Neural circuits as computational dynamical systems , 2014, Current Opinion in Neurobiology.
[41] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[42] Stefan Schaal,et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients , 2008 .
[43] Alex Graves,et al. Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.
[44] Máté Lengyel,et al. Goal-Directed Decision Making with Spiking Neurons , 2016, The Journal of Neuroscience.
[45] Ha Hong,et al. Explicit information for category-orthogonal object properties increases along the ventral stream , 2016, Nature Neuroscience.
[46] Eytan Ruppin,et al. Actor-critic models of the basal ganglia , 2002 .
[47] W. Newsome,et al. Context-dependent computation by recurrent dynamics in prefrontal cortex , 2013, Nature.
[48] Konrad P. Körding,et al. Toward an Integration of Deep Learning and Neuroscience , 2016, bioRxiv.
[49] Robert C. Wilson,et al. Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex , 2011, Nature Neuroscience.
[50] Razvan Pascanu,et al. How to Construct Deep Recurrent Neural Networks , 2013, ICLR.
[51] J. Hollerman,et al. Reward processing in primate orbitofrontal cortex and basal ganglia. , 2000, Cerebral cortex.
[52] Eytan Ruppin,et al. Actor-critic models of the basal ganglia: new anatomical and computational perspectives , 2002, Neural Networks.
[53] P. Dayan,et al. Decision theory, reinforcement learning, and the brain , 2008, Cognitive, affective & behavioral neuroscience.
[54] H. Seung,et al. Model of birdsong learning based on gradient estimation by dynamic perturbation of neural conductances. , 2007, Journal of neurophysiology.
[55] Xiao-Jing Wang,et al. Neural mechanism for stochastic behaviour during a competitive game , 2006, Neural Networks.
[56] W. Senn,et al. Reinforcement learning in populations of spiking neurons , 2008, Nature Neuroscience.
[57] L. F. Abbott,et al. Generating Coherent Patterns of Activity from Chaotic Neural Networks , 2009, Neuron.
[58] A. Koulakov,et al. Orbitofrontal Cortex Is Required for Optimal Waiting Based on Decision Confidence , 2014, Neuron.
[59] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[60] Thomas Miconi,et al. Biologically plausible learning in recurrent neural networks for flexible decision tasks , 2022 .
[61] W. Newsome,et al. Choosing the greater of two goods: neural currencies for valuation and decision making , 2005, Nature Reviews Neuroscience.
[62] Marisa Kellam,et al. Silencing Critics , 2016 .
[63] Gregor Hohpe,et al. Toward Integration , 2002 .
[64] H. Seo,et al. A reservoir of time constants for memory traces in cortical neurons , 2011, Nature Neuroscience.
[65] J. Gold,et al. The neural basis of decision making. , 2007, Annual review of neuroscience.
[66] W. Schultz. Midbrain Dopamine Neurons , 2009 .
[67] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[68] John Salvatier,et al. Theano: A Python framework for fast computation of mathematical expressions , 2016, ArXiv.
[69] Timothy D. Hanks,et al. Bounded Integration in Parietal Cortex Underlies Decisions Even When Viewing Duration Is Dictated by the Environment , 2008, The Journal of Neuroscience.
[70] E. Rolls,et al. The Orbitofrontal Cortex , 2019 .
[71] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.
[72] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[73] Matthew T. Kaufman,et al. A neural network that finds a naturalistic solution for the production of muscle activity , 2015, Nature Neuroscience.
[74] Jan Peters,et al. Policy Gradient Methods , 2010, Encyclopedia of Machine Learning.
[75] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[76] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.
[77] T. Maia. Two-factor theory, the actor-critic model, and conditioned avoidance , 2010, Learning & behavior.
[78] Rajesh P. N. Rao. Decision Making Under Uncertainty: A Neural Model Based on Partially Observable Markov Decision Processes , 2010, Front. Comput. Neurosci..
[79] Jürgen Schmidhuber,et al. Recurrent policy gradients , 2010, Log. J. IGPL.
[80] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[81] David Raposo,et al. Multisensory Decision-Making in Rats and Humans , 2012, The Journal of Neuroscience.
[82] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[83] Ila R Fiete,et al. Gradient learning in spiking neural networks by dynamic perturbation of conductances. , 2006, Physical review letters.
[84] Wulfram Gerstner,et al. Does computational neuroscience need new synaptic learning paradigms? , 2016, Current Opinion in Behavioral Sciences.
[85] David J. Freedman,et al. Choice-correlated activity fluctuations underlie learning of neuronal category representation , 2015, Nature Communications.
[86] H. Seung,et al. Learning in Spiking Neural Networks by Reinforcement of Stochastic Synaptic Transmission , 2003, Neuron.
[87] E. Izhikevich. Solving the distal reward problem through linkage of STDP and dopamine signaling , 2007, BMC Neuroscience.
[88] Daniel Cownden,et al. Random feedback weights support learning in deep neural networks , 2014, ArXiv.
[89] M. Shadlen,et al. A role for neural integrators in perceptual decision making. , 2003, Cerebral cortex.
[90] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[91] Christopher D. Harvey,et al. Recurrent Network Models of Sequence Generation and Memory , 2016, Neuron.
[92] Andrew W. Moore,et al. Gradient Descent for General Reinforcement Learning , 1998, NIPS.
[93] Dean V. Buonomano,et al. Robust timing and motor patterns by taming chaos in recurrent neural networks , 2012, Nature Neuroscience.
[94] James L. McClelland,et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations , 1986 .
[95] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[96] Philipp Slusallek,et al. Introduction to real-time ray tracing , 2005, SIGGRAPH Courses.
[97] Colin J. Akerman,et al. Random synaptic feedback weights support error backpropagation for deep learning , 2016, Nature Communications.
[98] Yoshua Bengio,et al. Towards a Biologically Plausible Backprop , 2016, ArXiv.
[99] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[100] Barbara Hammer,et al. Learning with recurrent neural networks , 2000 .
[101] P. Dayan,et al. Reward, Motivation, and Reinforcement Learning , 2002, Neuron.
[102] Wojciech Zaremba,et al. Reinforcement Learning Neural Turing Machines , 2015, ArXiv.
[103] G. Schoenbaum,et al. Does the orbitofrontal cortex signal value? , 2011, Annals of the New York Academy of Sciences.
[104] Marc'Aurelio Ranzato,et al. Sequence Level Training with Recurrent Neural Networks , 2015, ICLR.
[105] Xiao-Jing Wang. Decision Making in Recurrent Neuronal Circuits , 2008, Neuron.
[106] Christian K. Machens,et al. Functional, But Not Anatomical, Separation of “What” and “When” in Prefrontal Cortex , 2009 .
[107] Xiao-Jing Wang,et al. Confidence estimation as a stochastic process in a neurodynamical system of decision making. , 2015, Journal of neurophysiology.
[108] M. Desmurget,et al. Basal ganglia contributions to motor control: a vigorous tutor , 2010, Current Opinion in Neurobiology.
[109] Konrad P. Kording,et al. Towards an integration of deep learning and neuroscience , 2016, bioRxiv.
[110] Guangyu R. Yang,et al. Training Excitatory-Inhibitory Recurrent Neural Networks for Cognitive Tasks: A Simple and Flexible Framework , 2016, PLoS Comput. Biol..
[111] Karl J. Friston,et al. Dissociable Roles of Ventral and Dorsal Striatum in Instrumental Conditioning , 2004, Science.
[112] Matthew T. Kaufman,et al. A category-free neural population supports evolving demands during decision-making , 2014, Nature Neuroscience.
[113] L. Abbott,et al. From fixed points to chaos: Three models of delayed discrimination , 2013, Progress in Neurobiology.
[114] Ha Hong,et al. Performance-optimized hierarchical models predict neural responses in higher visual cortex , 2014, Proceedings of the National Academy of Sciences.
[115] Hatim A. Zariwala,et al. Neural correlates, computation and behavioural impact of decision confidence , 2008, Nature.
[116] Ilya Sutskever,et al. Learning Recurrent Neural Networks with Hessian-Free Optimization , 2011, ICML.
[117] Alex Graves,et al. Decoupled Neural Interfaces using Synthetic Gradients , 2016, ICML.
[118] M. Shadlen,et al. Representation of Confidence Associated with a Decision by Neurons in the Parietal Cortex , 2009, Science.
[119] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.