暂无分享,去创建一个
[1] H. Robbins. A Stochastic Approximation Method , 1951 .
[2] Eric Wiewiora,et al. Potential-Based Shaping and Q-Value Initialization are Equivalent , 2003, J. Artif. Intell. Res..
[3] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[4] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[5] DarrellTrevor,et al. End-to-end training of deep visuomotor policies , 2016 .
[6] Andrew G. Barto,et al. Autonomous shaping: knowledge transfer in reinforcement learning , 2006, ICML.
[7] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[8] Honglak Lee,et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning , 2014, NIPS.
[9] Rich Caruana,et al. Do Deep Nets Really Need to be Deep? , 2013, NIPS.
[10] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[11] Doina Precup,et al. A Convergent Form of Approximate Policy Iteration , 2002, NIPS.
[12] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[13] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[14] Bikramjit Banerjee,et al. General Game Learning Using Knowledge Transfer , 2007, IJCAI.
[15] David K. Smith,et al. Dynamic Programming and Optimal Control. Volume 1 , 1996 .
[16] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[17] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[18] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..
[19] Sergey Levine,et al. Guided Policy Search , 2013, ICML.
[20] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..
[21] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[22] Yoshua Bengio,et al. FitNets: Hints for Thin Deep Nets , 2014, ICLR.