[1] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[2] Peter L. Bartlett, et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning, 2016, ArXiv.
[3] Pieter Abbeel, et al. Some Considerations on Learning to Explore via Meta-Reinforcement Learning, 2018, ICLR.
[4] Yee Whye Teh, et al. Distral: Robust multitask reinforcement learning, 2017, NIPS.
[5] Vicenç Gómez, et al. Dynamic Policy Programming with Function Approximation, 2011, AISTATS.
[6] Marc Toussaint, et al. On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference (Extended Abstract), 2013, IJCAI.
[7] Vicenç Gómez, et al. Optimal control as a graphical model inference problem, 2009, Machine Learning.
[8] R. J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[9] Jing Peng, et al. Function Optimization using Connectionist Reinforcement Learning Algorithms, 1991.
[10] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.
[11] Marc Toussaint, et al. Robot trajectory optimization using approximate inference, 2009, ICML.
[12] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[13] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[14] Emanuel Todorov, et al. Efficient computation of optimal actions, 2009, Proceedings of the National Academy of Sciences.
[15] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[16] Roy Fox, et al. Taming the Noise in Reinforcement Learning via Soft Updates, 2015, UAI.
[17] Zeb Kurth-Nelson, et al. Learning to reinforcement learn, 2016, CogSci.
[18] Kavosh Asadi, et al. An Alternative Softmax Operator for Reinforcement Learning, 2016, ICML.