Analysis of Reward Functions in Deep Reinforcement Learning for Continuous State Space Control
[1] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[3] Wojciech Zaremba, et al. OpenAI Gym, 2016, ArXiv.
[4] Aleksander Madry, et al. How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift), 2018, NeurIPS.
[5] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[6] Jan M. Maciejowski, et al. Predictive Control: With Constraints, 2002.
[7] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[8] D. K. Smith, et al. Numerical Optimization, 2001, J. Oper. Res. Soc.
[10] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[11] Shimon Whiteson, et al. OFFER: Off-Environment Reinforcement Learning, 2017, AAAI.
[12] Lex Weaver, et al. The Optimal Reward Baseline for Gradient-Based Reinforcement Learning, 2001, UAI.
[13] Philip Bachman, et al. Deep Reinforcement Learning that Matters, 2017, AAAI.
[14] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[15] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[16] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[18] H. Kushner, et al. Stochastic Approximation and Recursive Algorithms and Applications, 2003.
[19] Peter L. Bartlett, et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning, 2001, J. Mach. Learn. Res.
[20] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[21] Alberto Bemporad, et al. Predictive Control for Linear and Hybrid Systems, 2017.
[22] Hao Li, et al. Visualizing the Loss Landscape of Neural Nets, 2017, NeurIPS.