Finding aircraft collision-avoidance strategies using policy search methods
[1] Yi Gu, et al. Space-indexed dynamic programming: learning to follow trajectories, 2008, ICML '08.
[2] Jeff G. Schneider, et al. Policy Search by Dynamic Programming, 2003, NIPS.
[3] Michail G. Lagoudakis, et al. Reinforcement Learning as Classification: Leveraging Modern Classifiers, 2003, ICML.
[4] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[5] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[6] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.
[7] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.