Reward-based training of recurrent neural networks for cognitive and value-based tasks
[1] K. Koketsu,et al. Cholinergic and inhibitory synapses in a pathway from motor‐axon collaterals to motoneurones , 1954, The Journal of physiology.
[2] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[3] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[4] Richard A. Andersen,et al. A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons , 1988, Nature.
[5] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[6] Dhanistha Panyasak,et al. Circuits , 1995, Annals of the New York Academy of Sciences.
[7] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.
[8] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[9] R. Romo,et al. Neuronal correlates of parametric working memory in the prefrontal cortex , 1999, Nature.
[10] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[11] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.
[12] J. Hollerman,et al. Reward processing in primate orbitofrontal cortex and basal ganglia. , 2000, Cerebral cortex.
[13] Xiao-Jing Wang,et al. Probabilistic Decision Making by Slow Reverberation in Cortical Circuits , 2002, Neuron.
[14] M. Shadlen,et al. Response of Neurons in the Lateral Intraparietal Area during a Combined Visual Discrimination Reaction Time Task , 2002, The Journal of Neuroscience.
[15] H. Seung,et al. Learning in Spiking Neural Networks by Reinforcement of Stochastic Synaptic Transmission , 2003, Neuron.
[16] M. Shadlen,et al. A role for neural integrators in perceptual decision making. , 2003, Cerebral cortex.
[17] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[18] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, MIT Press.
[19] P. Glimcher,et al. Midbrain Dopamine Neurons Encode a Quantitative Reward Prediction Error Signal , 2005, Neuron.
[20] W. Newsome,et al. Choosing the greater of two goods: neural currencies for valuation and decision making , 2005, Nature Reviews Neuroscience.
[21] Ila R Fiete,et al. Gradient learning in spiking neural networks by dynamic perturbation of conductances. , 2006, Physical review letters.
[22] C. Padoa-Schioppa,et al. Neurons in the orbitofrontal cortex encode economic value , 2006, Nature.
[23] Xiao-Jing Wang,et al. A Recurrent Network Mechanism of Time Integration in Perceptual Decisions , 2006, The Journal of Neuroscience.
[24] Xiao-Jing Wang,et al. Neural mechanism for stochastic behaviour during a competitive game , 2006, Neural Networks.
[25] M. Frank,et al. Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. , 2006, Psychological review.
[26] J. Wallis. Orbitofrontal cortex and its contribution to decision-making. , 2007, Annual review of neuroscience.
[27] J. Gold,et al. The neural basis of decision making. , 2007, Annual review of neuroscience.
[28] E. Izhikevich. Solving the distal reward problem through linkage of STDP and dopamine signaling , 2007, BMC Neuroscience.
[29] H. Seung,et al. Model of birdsong learning based on gradient estimation by dynamic perturbation of neural conductances. , 2007, Journal of neurophysiology.
[30] Timothy D. Hanks,et al. Bounded Integration in Parietal Cortex Underlies Decisions Even When Viewing Duration Is Dictated by the Environment , 2008, The Journal of Neuroscience.
[31] Hatim A. Zariwala,et al. Neural correlates, computation and behavioural impact of decision confidence , 2008, Nature.
[32] P. Dayan,et al. Decision theory, reinforcement learning, and the brain , 2008, Cognitive, affective & behavioral neuroscience.
[33] Jonathan D. Cohen,et al. Learning to Use Working Memory in Partially Observable Environments through Dopaminergic Reinforcement , 2008, NIPS.
[34] Xiao-Jing Wang. Decision Making in Recurrent Neuronal Circuits , 2008, Neuron.
[35] Stefan Schaal,et al. Reinforcement learning of motor skills with policy gradients , 2008, Neural Networks.
[36] M. Shadlen,et al. Representation of Confidence Associated with a Decision by Neurons in the Parietal Cortex , 2009, Science.
[37] L. F. Abbott,et al. Generating Coherent Patterns of Activity from Chaotic Neural Networks , 2009, Neuron.
[38] W. Senn,et al. Reinforcement learning in populations of spiking neurons , 2008, Nature Neuroscience.
[39] Rajesh P. N. Rao. Decision Making Under Uncertainty: A Neural Model Based on Partially Observable Markov Decision Processes , 2010, Front. Comput. Neurosci..
[40] Xiao-Jing Wang,et al. Internal Representation of Task Rules by Recurrent Dynamics: The Importance of the Diversity of Neural Responses , 2010, Front. Comput. Neurosci..
[41] Henning Sprekeler,et al. Functional Requirements for Reward-Modulated Spike-Timing-Dependent Plasticity , 2010, The Journal of Neuroscience.
[42] Jürgen Schmidhuber,et al. Recurrent policy gradients , 2010, Log. J. IGPL.
[43] Xiao-Jing Wang,et al. Synaptic computation underlying probabilistic inference , 2010, Nature Neuroscience.
[44] M. Desmurget,et al. Basal ganglia contributions to motor control: a vigorous tutor , 2010, Current Opinion in Neurobiology.
[45] Christian K. Machens,et al. Functional, but not anatomical, separation of "what" and "when" in prefrontal cortex , 2010, The Journal of Neuroscience.
[46] Ilya Sutskever,et al. Learning Recurrent Neural Networks with Hessian-Free Optimization , 2011, ICML.
[47] G. Schoenbaum,et al. Does the orbitofrontal cortex signal value? , 2011, Annals of the New York Academy of Sciences.
[48] H. Seo,et al. A reservoir of time constants for memory traces in cortical neurons , 2011, Nature Neuroscience.
[49] Robert C. Wilson,et al. Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex , 2011, Nature Neuroscience.
[50] N. Daw,et al. Signals in Human Striatum Are Appropriate for Policy Update Rather than Value Prediction , 2011, The Journal of Neuroscience.
[51] Robert Babuska,et al. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[52] D. Buonomano,et al. Complexity without chaos: Plasticity within random recurrent networks generates robust timing and motor control , 2012, arXiv:1210.2104.
[53] David Raposo,et al. Multisensory Decision-Making in Rats and Humans , 2012, The Journal of Neuroscience.
[54] David Sussillo,et al. Opening the Black Box: Low-Dimensional Dynamics in High-Dimensional Recurrent Neural Networks , 2013, Neural Computation.
[55] L. Abbott,et al. From fixed points to chaos: Three models of delayed discrimination , 2013, Progress in Neurobiology.
[56] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.
[57] Alex Graves,et al. Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.
[58] W. Newsome,et al. Context-dependent computation by recurrent dynamics in prefrontal cortex , 2013, Nature.
[59] Xiao-Jing Wang,et al. The importance of mixed selectivity in complex cognitive tasks , 2013, Nature.
[60] Dean V. Buonomano,et al. Robust timing and motor patterns by taming chaos in recurrent neural networks , 2012, Nature Neuroscience.
[61] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[62] Ha Hong,et al. Performance-optimized hierarchical models predict neural responses in higher visual cortex , 2014, Proceedings of the National Academy of Sciences.
[63] Matthew T. Kaufman,et al. A category-free neural population supports evolving demands during decision-making , 2014, Nature Neuroscience.
[64] O. Hikosaka,et al. Basal ganglia circuits for reward value-guided behavior. , 2014, Annual review of neuroscience.
[65] David Sussillo,et al. Neural circuits as computational dynamical systems , 2014, Current Opinion in Neurobiology.
[66] Razvan Pascanu,et al. How to Construct Deep Recurrent Neural Networks , 2013, ICLR.
[67] Daniel L. K. Yamins,et al. Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition , 2014, PLoS Comput. Biol..
[68] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.
[69] Daniel Cownden,et al. Random feedback weights support learning in deep neural networks , 2014, ArXiv.
[70] A. Koulakov,et al. Orbitofrontal Cortex Is Required for Optimal Waiting Based on Decision Confidence , 2014, Neuron.
[71] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[72] W. Gerstner,et al. Optimal Control of Transient Dynamics in Balanced Networks Supports Generation of Complex Movements , 2014, Neuron.
[73] Wolfgang Maass,et al. Emergence of complex computational structures from chaotic neural networks through reward-modulated Hebbian learning. , 2014, Cerebral cortex.
[74] Pieter R. Roelfsema,et al. Reinforcement Learning of Linking and Tracing Contours in Recurrent Neural Networks , 2015, PLoS Comput. Biol..
[75] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[76] Xiao-Jing Wang,et al. Confidence estimation as a stochastic process in a neurodynamical system of decision making. , 2015, Journal of neurophysiology.
[77] G. Schoenbaum,et al. What the orbitofrontal cortex does not do , 2015, Nature Neuroscience.
[78] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[79] N. Parga,et al. Dynamic Control of Response Criterion in Premotor Cortex during Perceptual Detection under Temporal Uncertainty , 2015, Neuron.
[80] David J. Freedman,et al. Choice-correlated activity fluctuations underlie learning of neuronal category representation , 2015, Nature Communications.
[81] Matthew T. Kaufman,et al. A neural network that finds a naturalistic solution for the production of muscle activity , 2015, Nature Neuroscience.
[82] Wojciech Zaremba,et al. Reinforcement Learning Neural Turing Machines , 2015, ArXiv.
[83] Surya Ganguli,et al. On simplicity and complexity in the brave new world of large-scale neuroscience , 2015, Current Opinion in Neurobiology.
[84] Máté Lengyel,et al. Goal-Directed Decision Making with Spiking Neurons , 2016, The Journal of Neuroscience.
[85] Ha Hong,et al. Explicit information for category-orthogonal object properties increases along the ventral stream , 2016, Nature Neuroscience.
[86] Konrad P. Kording,et al. Towards an integration of deep learning and neuroscience , 2016, bioRxiv.
[87] Guangyu R. Yang,et al. Training Excitatory-Inhibitory Recurrent Neural Networks for Cognitive Tasks: A Simple and Flexible Framework , 2016, PLoS Comput. Biol..
[88] Marc'Aurelio Ranzato,et al. Sequence Level Training with Recurrent Neural Networks , 2015, ICLR.
[89] John Salvatier,et al. Theano: A Python framework for fast computation of mathematical expressions , 2016, ArXiv.
[90] Christopher D. Harvey,et al. Recurrent Network Models of Sequence Generation and Memory , 2016, Neuron.
[91] Thomas Miconi,et al. Biologically plausible learning in recurrent neural networks for flexible decision tasks , 2016, bioRxiv.
[92] Yael Niv,et al. Reinforcement learning with Marr , 2016, Current Opinion in Behavioral Sciences.
[93] Francesca Mastrogiuseppe,et al. Intrinsically-generated fluctuating activity in excitatory-inhibitory networks , 2016, PLoS Comput. Biol..