Limits in Long Path Learning with XCS

The development of the XCS Learning Classifier System [26] has produced a stable implementation that consistently identifies the accurate and optimally general population of classifiers mapping a given reward landscape [15,16,29]. XCS is particularly powerful in direct-reward environments, notably in problems suitable for commercial application [3]. Its application to delayed-reward environments has also shown promise, although early investigations used environments with a comparatively short delay to reward (e.g. [28,19]). Subsequent systematic investigations [19,20,1,2] have suggested that XCS has difficulty accurately mapping and exploiting even simple environments with moderate reward delays. This paper summarises these results and presents new results that identify some limits and their implications. A modification to the error computation within XCS is introduced that allows the minimum error parameter to be applied relative to the magnitude of the payoff to each classifier. First results demonstrate that this modification enables XCS to successfully map longer delayed-reward environments.
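To illustrate why a fixed minimum error can hide over-generalisation far from the reward, the following minimal Python sketch compares an absolute error threshold with one scaled by the classifier's payoff prediction. It is an illustration under stated assumptions, not the formulation used in the paper: the accuracy function follows the commonly described XCS form, while `relative_threshold`, `overgeneral_stats`, the idealised two-state error estimate, and the parameter values (reward 1000, discount 0.71, absolute threshold 10, relative threshold 1%) are hypothetical choices made for this example.

```python
def accuracy(error, threshold, alpha=0.1, nu=5.0):
    """Commonly described XCS accuracy: 1 below the threshold,
    a steep power-law falloff at or above it."""
    if error < threshold:
        return 1.0
    return alpha * (error / threshold) ** -nu


def relative_threshold(prediction, eps0_rel):
    """Hypothetical relative form: the minimum error is a fraction
    of the magnitude of the classifier's payoff prediction."""
    return eps0_rel * abs(prediction)


def discounted_payoff(reward, gamma, steps):
    """Payoff expected `steps` actions before the reward is received."""
    return reward * gamma ** steps


def overgeneral_stats(reward, gamma, steps):
    """Idealised prediction and error of a classifier that wrongly covers
    two adjacent states (`steps` and `steps + 1` from the reward),
    assuming both states are visited equally often."""
    near = discounted_payoff(reward, gamma, steps)
    far = discounted_payoff(reward, gamma, steps + 1)
    prediction = (near + far) / 2.0
    error = abs(near - far) / 2.0
    return prediction, error


if __name__ == "__main__":
    # Assumed illustrative parameters, not values taken from the paper.
    reward, gamma = 1000.0, 0.71
    eps0_abs, eps0_rel = 10.0, 0.01
    for steps in (0, 5, 10, 15):
        pred, err = overgeneral_stats(reward, gamma, steps)
        k_abs = accuracy(err, eps0_abs)
        k_rel = accuracy(err, relative_threshold(pred, eps0_rel))
        print(f"steps={steps:2d} prediction={pred:8.2f} error={err:7.2f} "
              f"absolute={k_abs:.4f} relative={k_rel:.6f}")
```

Under these assumptions, the over-general classifier is penalised near the reward with either threshold, but once the discounted payoffs shrink below the absolute threshold it is treated as fully accurate, whereas the payoff-relative threshold continues to mark it as inaccurate.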

[1]  Larry Bull,et al.  On accuracy-based fitness , 2002, Soft Comput.

[2]  John H. Holmes,et al.  A Genetics-Based Machine Learning Approach to Knowledge Discovery in Clinical Data. , 1996 .

[3]  Stewart W. Wilson ZCS: A Zeroth Level Classifier System , 1994, Evolutionary Computation.

[4]  Tim Kovacs,et al.  Advances in Learning Classifier Systems , 2001, Lecture Notes in Computer Science.

[5]  Stewart W. Wilson Mining Oblique Data with XCS , 2000, IWLCS.

[6]  Pier Luca Lanzi,et al.  An Analysis of Generalization in the XCS Classifier System , 1999, Evolutionary Computation.

[7]  Rick L. Riolo,et al.  The Emergence of Coupled Sequences of Classifiers , 1989, ICGA.

[8]  Stewart W. Wilson Classifier Fitness Based on Accuracy , 1995, Evolutionary Computation.

[9]  Larry Bull,et al.  On using ZCS in a Simulated Continuous Double-Auction Market , 1999, GECCO.

[10]  Robert E. Smith,et al.  Classifier systems in combat: two-sided learning of maneuvers for advanced fighter aircraft , 2000 .

[11]  Pier Luca Lanzi,et al.  A Roadmap to the Last Decade of Learning Classifier System Research , 1999, Learning Classifier Systems.

[12]  Larry Bull,et al.  Self-adaptive mutation in classifier system controllers , 2000 .

[13]  Alexandre Parodi,et al.  The animat and the physician , 1991 .

[14]  Tim Kovacs Evolving Optimal Populations with XCS Classifier Systems , 1996 .

[15]  John H. Holland,et al.  Induction: Processes of Inference, Learning, and Discovery , 1987, IEEE Expert.

[16]  Zbigniew Michalewicz,et al.  Evolutionary Computation 2 , 2000 .

[17]  Stewart W. Wilson,et al.  An Incremental Multiplexer Problem and Its Uses in Classifier System Research , 2001, IWLCS.

[18]  Dave Cliff,et al.  Adding Temporary Memory to ZCS , 1994, Adapt. Behav..

[19]  Pier Luca Lanzi Generalization in Wilson's Classifier System , 1998, PPSN.

[20]  Alwyn Barry,et al.  The stability of long action chains in XCS , 2002, Soft Comput.

[21]  Pier Luca Lanzi,et al.  A Study of the Generalization Capabilities of XCS , 1997, ICGA.

[22]  Stewart W. Wilson Generalization in the XCS Classifier System , 1998 .

[23]  Marco Colombetti,et al.  Robot Shaping: An Experiment in Behavior Engineering , 1997 .

[24]  Larry Bull,et al.  Learning Classifier Systems , 2002, Annual Conference on Genetic and Evolutionary Computation.

[25]  Daniele Montanari,et al.  Learning and bucket brigade dynamics in classifier systems , 1990 .

[26]  Martin V. Butz,et al.  An Algorithmic Description of XCS , 2000, IWLCS.

[27]  Stewart W. Wilson Generalization in Evolutionary Learning , 1997 .