A prioritized objective actor-critic method for deep reinforcement learning
Saeid Nahavandi | Peter Vamplew | Richard Dazeley | Ngoc Duy Nguyen | Thanh Thi Nguyen
[1] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[2] Runzhe Yang, et al. A Generalized Algorithm for Multi-Objective RL and Policy Adaptation, 2019.
[3] Csaba Szepesvári, et al. Multi-criteria Reinforcement Learning, 1998, ICML.
[4] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[5] John A. Bullinaria. Evolved age dependent plasticity improves neural network performance, 2005, Fifth International Conference on Hybrid Intelligent Systems (HIS'05).
[6] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[7] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[8] Sridhar Mahadevan, et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning, 1991, Artif. Intell.
[9] John Schulman, et al. Concrete Problems in AI Safety, 2016, ArXiv.
[10] Mikhail Pavlov, et al. Deep Attention Recurrent Q-Network, 2015, ArXiv.
[11] Hado van Hasselt, et al. Double Q-learning, 2010, NIPS.
[12] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[13] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[14] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[15] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[16] Carme Torras, et al. Safe robot execution in model-based reinforcement learning, 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[17] Shimon Whiteson, et al. A Survey of Multi-Objective Sequential Decision-Making, 2013, J. Artif. Intell. Res.
[18] John Yearwood, et al. On the Limitations of Scalarisation for Multi-objective Reinforcement Learning of Pareto Fronts, 2008, Australasian Conference on Artificial Intelligence.
[19] Stefan Schaal, et al. Learning from Demonstration, 1996, NIPS.
[20] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discret. Event Dyn. Syst.
[22] Oliver Kroemer, et al. Learning to select and generalize striking movements in robot table tennis, 2012, AAAI Fall Symposium: Robots Learning Interactively from Human Teachers.
[23] Martin A. Riedmiller, et al. Reinforcement learning for robot soccer, 2009, Auton. Robots.
[24] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[25] S. Shankar Sastry, et al. Autonomous Helicopter Flight via Reinforcement Learning, 2003, NIPS.
[26] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[27] Jun Tani, et al. Learning Multiple Goal-Directed Actions Through Self-Organization of a Dynamic Neural Network Model: A Humanoid Robot Experiment, 2008, Adapt. Behav.
[28] Junichi Murata, et al. Novelty-organizing team of classifiers in noisy and dynamic environments, 2015, 2015 IEEE Congress on Evolutionary Computation (CEC).
[29] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.
[30] Ben Tse, et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning, 2004, ISER.
[31] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[32] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[33] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[34] Marco Laumanns, et al. Scalable multi-objective optimization test problems, 2002, Proceedings of the 2002 Congress on Evolutionary Computation. CEC'02 (Cat. No.02TH8600).
[35] Evan Dekker, et al. Empirical evaluation methods for multiobjective reinforcement learning algorithms, 2011, Machine Learning.