Zhiyong Chen | Nasimul Noman | Mohsen Zamani | Chayan Banerjee
[1] Nando de Freitas, et al. Sample Efficient Actor-Critic with Experience Replay, 2016, ICLR.
[2] Sergey Levine, et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation, 2018, CoRL.
[3] Guy Lever, et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning, 2018, Science.
[4] Jian Peng, et al. Policy Optimization by Genetic Distillation, 2017, ICLR.
[5] Sergey Levine, et al. Learning to Walk via Deep Reinforcement Learning, 2018, Robotics: Science and Systems.
[6] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[7] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[8] Kagan Tumer, et al. Evolution-Guided Policy Gradient in Reinforcement Learning, 2018, NeurIPS.
[9] Elman Mansimov, et al. Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation, 2017, NIPS.
[10] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[11] Kenneth O. Stanley, et al. Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning, 2017, ArXiv.
[12] Derong Liu, et al. Adaptive $Q$-Learning for Data-Based Optimal Output Regulation With Experience Replay, 2018, IEEE Transactions on Cybernetics.
[13] Adarsh Sehgal, et al. Deep Reinforcement Learning Using Genetic Algorithm for Parameter Optimization, 2019, 2019 Third IEEE International Conference on Robotic Computing (IRC).
[14] S. Srihari. Mixture Density Networks, 1994.
[15] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[16] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[17] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[18] Long Ji Lin, et al. Self-improving reactive agents based on reinforcement learning, planning and teaching, 1992, Machine Learning.
[19] Sergey Levine, et al. Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic, 2016, ICLR.
[20] Mengjie Zhang, et al. Evolving Deep Convolutional Neural Networks for Image Classification, 2017, IEEE Transactions on Evolutionary Computation.
[21] S. Levine, et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, 2020, ArXiv.
[22] Alberto Rodriguez, et al. Learning Synergies Between Pushing and Grasping with Self-Supervised Deep Reinforcement Learning, 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[23] Martin A. Riedmiller, et al. Batch Reinforcement Learning, 2012, Reinforcement Learning.
[24] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[25] Shiliang Sun, et al. A Survey of Optimization Methods From a Machine Learning Perspective, 2019, IEEE Transactions on Cybernetics.
[26] Kagan Tumer, et al. Collaborative Evolutionary Reinforcement Learning, 2019, ICML.
[27] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[28] E. Purcell. Life at Low Reynolds Number, 2008.
[29] Martha White, et al. Linear Off-Policy Actor-Critic, 2012, ICML.
[30] Sergey Levine, et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning, 2020, ArXiv.
[31] Atil Iscen, et al. Data Efficient Reinforcement Learning for Legged Robots, 2019, CoRL.
[32] Limeng Cui, et al. GADAM: Genetic-Evolutionary ADAM for Deep Neural Network Optimization, 2018, ArXiv.
[33] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[34] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[35] Frank L. Lewis, et al. Off-Policy Actor-Critic Structure for Optimal Control of Unknown Systems With Disturbances, 2016, IEEE Transactions on Cybernetics.
[36] Natasha Jaques, et al. Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog, 2019, ArXiv.
[37] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[38] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[39] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[40] Sergey Levine, et al. Scalable Multi-Task Imitation Learning with Autonomous Improvement, 2020, 2020 IEEE International Conference on Robotics and Automation (ICRA).