Variable Metric Reinforcement Learning Methods Applied to the Noisy Mountain Car Problem

Two variable metric reinforcement learning methods, the natural actor-critic algorithm and the covariance matrix adaptation evolution strategy, are compared on a conceptual level and analysed experimentally on the mountain car benchmark task with and without noise.
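To make the experimental setting concrete, below is a minimal Python sketch of the standard mountain car dynamics (as described by Sutton and Barto) with an optional Gaussian observation-noise term. The step function, the constants, and in particular the noise model are illustrative assumptions; they are not taken from the paper and may differ from the noise setup actually used in the experiments.

```python
import math
import random

# Minimal sketch of the classic mountain car dynamics (Sutton & Barto),
# with optional Gaussian observation noise as a stand-in for the "noisy"
# variant; the exact noise model used in the paper is an assumption here.

GOAL_POSITION = 0.5

def step(position, velocity, action, noise_std=0.0):
    """Advance the car by one time step.

    action is the applied force in {-1, 0, +1}; noise_std > 0 adds Gaussian
    noise to the observed state (hypothetical noise model).
    """
    velocity += 0.001 * action - 0.0025 * math.cos(3.0 * position)
    velocity = max(-0.07, min(0.07, velocity))
    position = max(-1.2, min(0.6, position + velocity))
    if position <= -1.2:
        velocity = 0.0  # inelastic collision with the left wall
    done = position >= GOAL_POSITION
    # Noisy observation returned to the learner; true state kept separately.
    obs = (position + random.gauss(0.0, noise_std),
           velocity + random.gauss(0.0, noise_std))
    return obs, (position, velocity), -1.0, done  # reward of -1 per step

# Usage: run a random policy for one episode on the noisy variant.
pos, vel = random.uniform(-0.6, -0.4), 0.0
for t in range(1000):
    action = random.choice([-1, 0, 1])
    obs, (pos, vel), reward, done = step(pos, vel, action, noise_std=0.01)
    if done:
        break
```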
