The Essential Dynamics Algorithm: Fast Policy Search In Continuous Worlds
[1] Donald P. Gaver Jr. Statistical methods for improving simulation efficiency , 1969 .
[2] George S. Fishman,et al. Solution of Large Networks by Matrix Methods , 1976, IEEE Transactions on Systems, Man, and Cybernetics.
[3] R. Rubinstein,et al. On the optimality and efficiency of common random numbers , 1984 .
[4] William H. Press,et al. Numerical Recipes: The Art of Scientific Computing , 1987 .
[5] Karl Johan Åström,et al. Adaptive Control , 1989, Embedded Digital Control with Microcontrollers.
[6] Maja J. Mataric,et al. Reward Functions for Accelerated Learning , 1994, ICML.
[7] Marco Colombetti,et al. Training Agents to Perform Sequential Behavior , 1994, Adapt. Behav..
[8] B. Pasik-Duncan,et al. Adaptive Control , 1996, IEEE Control Systems.
[9] Ruth F. Curtain,et al. Linear-quadratic control: An introduction , 1997, Autom..
[10] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[11] Peter L. Bartlett,et al. Reinforcement Learning in POMDP's via Direct Gradient Ascent , 2000, ICML.
[12] Michael I. Jordan,et al. PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.
[13] Jette Randløv,et al. Shaping in Reinforcement Learning by Changing the Physics of the Problem , 2000, ICML.
[14] Andrew W. Moore,et al. Policy Search using Paired Comparisons , 2003, J. Mach. Learn. Res..
[15] Sebastian Thrun,et al. Motion planning through policy search , 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[16] Martin C. Martin,et al. The Essential Dynamics Algorithm: Essential Results , 2003 .
[17] Pat Langley,et al. Editorial: On Machine Learning , 1986, Machine Learning.
[18] Andrew W. Moore,et al. Locally Weighted Learning for Control , 1997, Artificial Intelligence Review.
[19] Martin C. Martin. Controlling Cardea: Fast Policy Search in a High Dimensional Space , 2004 .