Paper information - Exemplar-based direct policy search with evolutionary optimization

Exemplar-based direct policy search with evolutionary optimization

In this paper, an exemplar-based policy optimization framework for direct policy search is presented. In this exemplar-based approach, the policy to be optimized is composed of a set of exemplars and a case-based action selector. An implementation of this approach using a state-action-based policy representation and an evolutionary algorithm optimizer is shown to provide favorable search performance for two higher-dimensional problems.
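The structure described in the abstract (a set of exemplars, a case-based action selector, and an evolutionary optimizer) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the nearest-neighbor selection rule, the toy 1-D task in `evaluate`, the Gaussian mutation, the truncation selection scheme, and all function names and parameters.

```python
import random

def select_action(exemplars, state):
    """Case-based selector (assumed rule): return the action of the
    exemplar whose stored state is nearest to the query state."""
    def sq_dist(exemplar):
        ex_state, _ = exemplar
        return sum((a - b) ** 2 for a, b in zip(ex_state, state))
    return min(exemplars, key=sq_dist)[1]

def evaluate(exemplars, steps=50):
    """Hypothetical toy task: drive a 1-D point from x = 1.0 toward 0.
    Returns the negative accumulated squared distance (higher is better)."""
    x, cost = 1.0, 0.0
    for _ in range(steps):
        u = select_action(exemplars, (x,))
        x += 0.1 * u
        cost += x * x
    return -cost

def mutate(exemplars, rng, sigma=0.2):
    """Perturb every exemplar's state and action with Gaussian noise."""
    return [((s[0] + rng.gauss(0, sigma),), a + rng.gauss(0, sigma))
            for s, a in exemplars]

def evolve(generations=100, pop_size=20, n_exemplars=4, seed=0):
    """Simple elitist evolutionary loop over exemplar sets.
    Each individual is a list of (state, action) pairs."""
    rng = random.Random(seed)
    pop = [[((rng.uniform(-1, 1),), rng.uniform(-1, 1))
            for _ in range(n_exemplars)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=evaluate, reverse=True)
        parents = pop[: pop_size // 2]          # keep the better half
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=evaluate)
```

In this sketch the policy is the exemplar set itself, so the optimizer searches directly over state-action pairs rather than over the weights of a parametric function approximator.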

Kokolo Ikeda

[1] Andrew G. Barto, et al. Robot Weightlifting By Direct Policy Search, 2001, IJCAI.

[2] John J. Grefenstette, et al. Learning Sequential Decision Rules Using Simulation Models and Competition, 1990, Machine Learning.

[3] Mark W. Spong, et al. Swing up control of the Acrobot, 1994, Proceedings of the 1994 IEEE International Conference on Robotics and Automation.

[4] Darrell Whitley, et al. Genitor: a different genetic algorithm, 1988.

[5] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.

[6] John J. Grefenstette, et al. Evolutionary Algorithms for Reinforcement Learning, 1999, J. Artif. Intell. Res.

[7] Steven Salzberg, et al. A Teaching Strategy for Memory-Based Control, 1997, Artificial Intelligence Review.

[8] Shigenobu Kobayashi, et al. Edge Assembly Crossover: A High-Power Genetic Algorithm for the Travelling Salesman Problem, 1997, ICGA.

[9] John J. Grefenstette, et al. Learning sequential decision rules using simulation models and competition, 2004, Machine Learning.

[10] Jan Telgen, et al. Stochastic Dynamic Programming, 2016.