Paper information - Finding aircraft collision-avoidance strategies using policy search methods - 字舞流文

Finding aircraft collision-avoidance strategies using policy search methods

Leslie Pack Kaelbling | Tomas Lozano-Perez

[1] Yi Gu, et al. Space-indexed dynamic programming: learning to follow trajectories, 2008, ICML '08.

[2] Jeff G. Schneider, et al. Policy Search by Dynamic Programming, 2003, NIPS.

[3] Michail G. Lagoudakis, et al. Reinforcement Learning as Classification: Leveraging Modern Classifiers, 2003, ICML.

[4] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.

[5] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.

[6] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.

[7] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.