Deep Reinforcement Learning With Modulated Hebbian Plus Q-Network Architecture
Pawel Ladosz | Eseoghene Ben-Iwhiwhu | Jeffery Dick | Yang Hu | Nicholas A. Ketz | Soheil Kolouri | Jeffrey L. Krichmar | Praveen K. Pilly | Andrea Soltoggio
[1] Marc G. Bellemare,et al. Safe and Efficient Off-Policy Reinforcement Learning , 2016, NIPS.
[2] Sergey Levine,et al. Learning deep neural network policies with continuous memory states , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[3] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998.
[5] David Budden,et al. Distributed Prioritized Experience Replay , 2018, ICLR.
[6] Kenneth O. Stanley,et al. Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity , 2018, ICLR.
[7] H. Markram,et al. Regulation of Synaptic Efficacy by Coincidence of Postsynaptic APs and EPSPs , 1997, Science.
[8] Honglak Lee,et al. Control of Memory, Active Perception, and Action in Minecraft , 2016, ICML.
[9] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[10] Kenneth O. Stanley,et al. Differentiable plasticity: training plastic neural networks with backpropagation , 2018, ICML.
[11] Shimon Whiteson,et al. Deep Variational Reinforcement Learning for POMDPs , 2018, ICML.
[13] Ruslan Salakhutdinov,et al. Neural Map: Structured Memory for Deep Reinforcement Learning , 2017, ICLR.
[14] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[15] Jochen J. Steil,et al. Rare Neural Correlations Implement Robotic Conditioning with Delayed Rewards and Disturbances , 2013, Front. Neurorobot..
[16] Razvan Pascanu,et al. Learning to Navigate in Complex Environments , 2016, ICLR.
[17] Ngo Anh Vien,et al. A Deep Hierarchical Reinforcement Learning Algorithm in Partially Observable Markov Decision Processes , 2018, IEEE Access.
[18] Jochen J. Steil,et al. Learning the rules of a game: Neural conditioning in human-robot interaction with delayed rewards , 2013, 2013 IEEE Third Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL).
[19] Shane Legg,et al. Noisy Networks for Exploration , 2017, ICLR.
[21] Kenneth O. Stanley,et al. From modulated Hebbian plasticity to simple behavior learning through noise and weight saturation , 2012, Neural Networks.
[22] Razvan Pascanu,et al. Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery , 2018, ArXiv.
[23] Sam Devlin,et al. AMRL: Aggregated Memory For Reinforcement Learning , 2020, ICLR.
[24] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[25] Guangwen Yang,et al. Episodic Memory Deep Q-Networks , 2018, IJCAI.
[27] David Pfau,et al. Bayesian Nonparametric Methods for Partially-Observable Reinforcement Learning , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[28] Doina Precup,et al. Temporal abstraction in reinforcement learning , 2000, ICML.
[30] Edward J. Sondik. The optimal control of partially observable Markov processes , 1971.
[32] E. Izhikevich. Solving the distal reward problem through linkage of STDP and dopamine signaling , 2007, BMC Neuroscience.
[33] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[34] Katja Hofmann,et al. The Malmo Platform for Artificial Intelligence Experimentation , 2016, IJCAI.
[35] Peter Vrancx,et al. Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets , 2017, AAAI.
[36] Jürgen Schmidhuber,et al. Solving Deep Memory POMDPs with Recurrent Policy Gradients , 2007, ICANN.
[37] Carl E. Rasmussen,et al. Data-Efficient Reinforcement Learning in Continuous State-Action Gaussian-POMDPs , 2017, NIPS.
[38] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[39] Pascal Poupart,et al. On Improving Deep Reinforcement Learning for POMDPs , 2017, ArXiv.
[42] Peter Stone,et al. Deep Recurrent Q-Learning for Partially Observable MDPs , 2015, AAAI Fall Symposia.
[43] Shie Mannor,et al. Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning , 2018, NeurIPS.
[44] Kamyar Azizzadenesheli,et al. Experimental results : Reinforcement Learning of POMDPs using Spectral Methods , 2017, ArXiv.
[45] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[46] Jochen J. Steil,et al. Solving the Distal Reward Problem with Rare Correlations , 2013, Neural Computation.
[47] David Silver,et al. Memory-based control with recurrent neural networks , 2015, ArXiv.
[48] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[49] Jürgen Schmidhuber,et al. Recurrent policy gradients , 2010, Log. J. IGPL.
[51] Kenneth D. Miller,et al. The Role of Constraints in Hebbian Learning , 1994, Neural Computation.
[52] J. Knott. The organization of behavior: A neuropsychological theory , 1951 .
[53] R. Kempter,et al. Hebbian learning and spiking neurons , 1999 .