Paper Information - XCS with Computable Prediction in Multistep Environments

XCS with Computable Prediction in Multistep Environments

XCSF extends the typical concept of learning classifier systems through the introduction of computable classifier prediction. Initial results show that XCSF’s computable prediction can be used to evolve accurate piecewise linear approximations of simple functions. In this paper, we take XCSF one step further and apply it to typical reinforcement learning problems involving delayed rewards. In essence, we use XCSF as a method of generalized (linear) reinforcement learning to evolve piecewise linear approximations of the payoff surfaces of typical multistep problems. Our results show that XCSF can easily evolve optimal and near-optimal solutions for problems introduced in the literature to test linear reinforcement learning methods.
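As a rough illustration of what "computable prediction" can mean in a multistep setting, the sketch below gives a single classifier a weight vector, computes its prediction as a linear function of the input, and moves the weights toward a discounted payoff target with a normalized delta-rule step. This is a minimal sketch under stated assumptions, not the paper's implementation: the names `X0`, `augment`, `predict`, `update`, and `td_target`, and the values of `eta` and `gamma`, are hypothetical and chosen only for illustration.

```python
X0 = 1.0  # constant input term prepended to every input (assumed value)


def augment(x):
    """Prepend the constant term so the weight vector includes an offset."""
    return [X0] + list(x)


def predict(w, x):
    """Classifier prediction computed as a linear function of the input x."""
    return sum(wi * xi for wi, xi in zip(w, augment(x)))


def update(w, x, target, eta=0.2):
    """One normalized delta-rule step moving predict(w, x) toward target."""
    xa = augment(x)
    error = target - predict(w, x)
    norm = sum(xi * xi for xi in xa)  # squared length of the augmented input
    return [wi + (eta / norm) * error * xi for wi, xi in zip(w, xa)]


def td_target(reward, next_predictions, gamma=0.9):
    """Delayed-reward target: reward plus the discounted best next prediction.

    An empty list of next predictions is treated as a terminal step.
    """
    if not next_predictions:
        return reward
    return reward + gamma * max(next_predictions)


# Usage: one update of a two-weight classifier on a one-dimensional input.
w = [0.0, 0.0]
target = td_target(reward=0.0, next_predictions=[10.0, 8.0])
w = update(w, [0.5], target)
```

Each classifier covers only part of the input space, so a population of such linear predictors yields a piecewise linear approximation of the payoff surface, which is the sense in which the abstract uses the term.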

Stewart W. Wilson | D. Goldberg | D. Loiacono | P. Lanzi

[1] Andrew W. Moore, et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function, 1994, NIPS.

[2] Gerald Tesauro, et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play, 1994, Neural Computation.

[3] Stewart W. Wilson. Classifier Fitness Based on Accuracy, 1995, Evolutionary Computation.

[4] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1996.

[5] Sebastian Thrun, et al. Issues in Using Function Approximation for Reinforcement Learning, 1999.

[6] M. Colombetti, et al. An extension to the XCS classifier system for stochastic environments, 1999.

[7] Martin V. Butz, et al. An algorithmic description of XCS, 2000, Soft Comput.

[8] Doina Precup, et al. A Convergent Form of Approximate Policy Iteration, 2002, NIPS.

[9] Stewart W. Wilson. Classifier Systems for Continuous Payoff Environments, 2004, GECCO.

[10] Stewart W. Wilson. Classifiers that approximate functions, 2002, Natural Computing.

[11] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.