Improved anticipatory classifier system with internal memory for POMDPs with aliased states

Abstract: ACSM (Anticipatory Classifier System with Memory; Hayashida et al., 2014) combines a method for discerning aliased states in a POMDP (partially observable Markov decision process), a Markov decision process in which the agent observes only local information about the environment, with a method for choosing the appropriate action based on internal memory and the sensory information the agent obtains from the environment. Although ACSM achieves the highest performance among existing classifier-system-based methods, it requires a very large amount of internal memory and takes a long time on some large-scale problems. This paper improves ACSM by focusing on its learning process, with the aim of making the system more efficient. The improved method is named ACSMr, and numerical experiments are conducted on five maze problems that are widely used as benchmarks for POMDPs. ACSMr achieves better experimental results than the existing classifier systems on these maze problems.

[1] Wolfgang Stolzmann, et al. An Introduction to Anticipatory Classifier Systems, 1999, Learning Classifier Systems.

[2] Sean Saxon, et al. XCS and the Monk's Problems, 1999, Learning Classifier Systems.

[3] Stewart W. Wilson, et al. Toward Optimal Classifier System Performance in Non-Markov Environments, 2000, Evolutionary Computation.

[4] Kenneth DeJong, et al. Learning with genetic algorithms: An overview, 1988, Machine Learning.

[5] John H. Holmes, et al. Rule Discovery in Epidemiologic Surveillance Data Using EpiXCS: An Evolutionary Computation Approach, 2005, AIME.

[6] Pier Luca Lanzi. An Analysis of the Memory Mechanism of XCSM, 2007.

[7] Martin V. Butz, et al. An algorithmic description of XCS, 2000, Soft Comput.

[8] Adel Torkaman Rahmani, et al. A Recursive Classifier System for Partially Observable Environments, 2009, Fundam. Informaticae.

[9] A. Martin V. Butz, et al. The anticipatory classifier system and genetic generalization, 2002, Natural Computing.

[10] Stewart W. Wilson. Mining Oblique Data with XCS, 2000, IWLCS.

[11] Stewart W. Wilson. ZCS: A Zeroth Level Classifier System, 1994, Evolutionary Computation.

[12] John H. Holland, et al. Cognitive systems based on adaptive algorithms, 1977, SGAR.

[13] John H. Holland, et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, 1992.

[14] P. Lanzi, et al. Adaptive Agents with Reinforcement Learning and Internal Memory, 2000.

[15] Tomohiro Hayashida, et al. Aliased States Discerning in POMDPs and Improved Anticipatory Classifier System, 2014, KES.