Approximate Policy-Based Accelerated Deep Reinforcement Learning
[1] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[2] Chaitali Chakrabarti,et al. A Deep Q-Learning Approach for Dynamic Management of Heterogeneous Processors , 2019, IEEE Computer Architecture Letters.
[3] Guohui Tian,et al. A method for knowledge construction from natural language based on reinforcement learning , 2017, 2017 29th Chinese Control And Decision Conference (CCDC).
[4] David Budden,et al. Distributed Prioritized Experience Replay , 2018, ICLR.
[5] Yishay Mansour,et al. Learning Rates for Q-learning , 2004, J. Mach. Learn. Res.
[6] Qiang Yu,et al. Multisource Transfer Double DQN Based on Actor Learning , 2018, IEEE Transactions on Neural Networks and Learning Systems.
[7] Jun Tan,et al. Parameterized Batch Reinforcement Learning for Longitudinal Control of Autonomous Land Vehicles , 2019, IEEE Transactions on Systems, Man, and Cybernetics: Systems.
[8] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[9] Wei Hu,et al. Exploring Deep Reinforcement Learning with Multi Q-Learning , 2016.
[10] Pieter Abbeel,et al. Accelerated Methods for Deep Reinforcement Learning , 2018, ArXiv.
[11] Dongbin Zhao,et al. Deep Reinforcement Learning With Visual Attention for Vehicle Classification , 2017, IEEE Transactions on Cognitive and Developmental Systems.
[12] Stephen Tyree,et al. GA3C: GPU-based A3C for Deep Reinforcement Learning , 2016, ArXiv.
[13] Rémi Munos,et al. Error Bounds for Approximate Policy Iteration , 2003, ICML.
[14] Santiago Ontañón,et al. High-Level Representations for Game-Tree Search in RTS Games , 2014, Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment.
[15] Marc G. Bellemare,et al. Increasing the Action Gap: New Operators for Reinforcement Learning , 2015, AAAI.
[16] Dongbin Zhao,et al. Cooperative reinforcement learning for multiple units combat in StarCraft , 2017, 2017 IEEE Symposium Series on Computational Intelligence (SSCI).
[17] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[18] Tingwen Huang,et al. Model-Free Optimal Tracking Control via Critic-Only Q-Learning , 2016, IEEE Transactions on Neural Networks and Learning Systems.
[19] Hilbert J. Kappen,et al. Speedy Q-Learning , 2011, NIPS.
[20] Yunpeng Pan,et al. Efficient Reinforcement Learning via Probabilistic Trajectory Optimization , 2018, IEEE Transactions on Neural Networks and Learning Systems.
[21] Yi Zhang,et al. Human-like Autonomous Vehicle Speed Control by Deep Reinforcement Learning with Double Q-Learning , 2018, 2018 IEEE Intelligent Vehicles Symposium (IV).
[22] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[23] Sergey Levine,et al. Continuous Deep Q-Learning with Model-based Acceleration , 2016, ICML.
[24] Yuval Tassa,et al. Learning Continuous Control Policies by Stochastic Value Gradients , 2015, NIPS.
[25] Daoyi Dong,et al. Self-Paced Prioritized Curriculum Learning With Coverage Penalty in Deep Reinforcement Learning , 2018, IEEE Transactions on Neural Networks and Learning Systems.
[26] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[27] Qiang Liu,et al. Learning to Explore with Meta-Policy Gradient , 2018, ICML.
[28] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[29] Xiaofang Zhang,et al. Averaged-A3C for Asynchronous Deep Reinforcement Learning , 2018, ICONIP.
[30] Shiji Song,et al. Plume Tracing via Model-Free Reinforcement Learning Method , 2019, IEEE Transactions on Neural Networks and Learning Systems.
[31] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[32] Zhen Ni,et al. A Multistage Game in Smart Grid Security: A Reinforcement Learning Solution , 2019, IEEE Transactions on Neural Networks and Learning Systems.
[33] Frank L. Lewis,et al. Off-Policy Interleaved $Q$-Learning: Optimal Control for Affine Nonlinear Discrete-Time Systems , 2019, IEEE Transactions on Neural Networks and Learning Systems.
[34] Rui Wang,et al. Multi-critic DDPG Method and Double Experience Replay , 2018, 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC).
[35] Wei Xing Zheng,et al. Optimal Synchronization Control of Multiagent Systems With Input Saturation via Off-Policy Reinforcement Learning , 2019, IEEE Transactions on Neural Networks and Learning Systems.
[36] Atsushi Ike,et al. GUNREAL: GPU-accelerated UNsupervised REinforcement and Auxiliary Learning , 2017, 2017 Fifth International Symposium on Computing and Networking (CANDAR).
[37] Nahum Shimkin,et al. Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning , 2016, ICML.
[38] Akanksha Rai Sharma,et al. Literature survey of statistical, deep and reinforcement learning in natural language processing , 2017, 2017 International Conference on Computing, Communication and Automation (ICCCA).
[39] Csaba Szepesvári,et al. Error Propagation for Approximate Policy and Value Iteration , 2010, NIPS.
[40] Hado van Hasselt,et al. Double Q-learning , 2010, NIPS.
[41] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.