Kamyar Azizzadenesheli | Anima Anandkumar | Manish Kumar Bera
[1] Nan Jiang,et al. The Dependence of Effective Planning Horizon on Model Accuracy , 2015, AAMAS.
[2] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[3] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[4] David Barber,et al. On the Computational Complexity of Stochastic Controller Optimization in POMDPs , 2011, TOCT.
[5] Zachary Chase Lipton,et al. Combating Deep Reinforcement Learning's Sisyphean Curse with Intrinsic Fear , 2016, arXiv:1611.01211.
[6] John N. Tsitsiklis,et al. The Complexity of Markov Decision Processes , 1987, Math. Oper. Res.
[7] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[8] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[9] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res.
[10] Kamyar Azizzadenesheli,et al. Open Problem: Approximate Planning of POMDPs in the class of Memoryless Policies , 2016, COLT.
[11] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[12] John M. Lee. Riemannian Manifolds: An Introduction to Curvature , 1997.
[13] Kamyar Azizzadenesheli,et al. signSGD: compressed optimisation for non-convex problems , 2018, ICML.
[14] Anne Condon,et al. On the Undecidability of Probabilistic Planning and Infinite-Horizon Partially Observable Markov Decision Problems , 1999, AAAI/IAAI.
[15] Kamyar Azizzadenesheli,et al. Experimental results: Reinforcement Learning of POMDPs using Spectral Methods , 2017, ArXiv.
[16] F. Opitz. Information geometry and its applications , 2012, 2012 9th European Radar Conference.
[17] Jianfeng Gao,et al. Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear , 2016, ArXiv.
[18] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[19] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[20] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[21] Kamyar Azizzadenesheli,et al. Reinforcement Learning of POMDPs using Spectral Methods , 2016, COLT.
[22] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[23] Keyan Zahedi,et al. Geometry and Determinism of Optimal Stationary Control in Partially Observable Markov Decision Processes , 2015, ArXiv.
[24] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.