Three Architectures for Continuous Action

Three classifier system architectures are introduced that permit continuous (non-discrete) actions. The first is based on interpolation, the second on an actor-critic paradigm, and the third on treating the action as a continuous variable homogeneous with the input. While the last architecture appears the most interesting and promising, all three offer potential directions toward continuous action, a goal that classifier systems have hardly addressed.
