Solving Problems in Partially Observable Environments with Classifier Systems (Experiments on Adding Memory to XCS)

XCS is a classifier system recently introduced by Wilson that differs from Holland's framework in that classifier fitness is based on the accuracy of the prediction instead of the prediction itself. According to the original proposal, XCS has no internal message list as traditional classifier systems do; hence XCS learns only reactive input/output mappings that are optimal in Markovian environments. When the environment is partially observable, i.e. non-Markovian, XCS evolves suboptimal solutions; to evolve an optimal policy in such environments the system needs some sort of internal memory mechanism. In this paper, we add an internal memory mechanism to the XCS classifier system. We then test XCS with internal memory, named XCSM, in non-Markovian environments of increasing difficulty. The experimental results we present show that XCSM is able to evolve optimal solutions in simple environments, while in more complex problems the system needs special operators or special exploration strategies. We also show that the performance of XCSM is very stable with respect to the size of the internal memory involved in learning. In particular, in complex non-Markovian environments the performance of XCSM turns out to be more stable when more memory bits than necessary are employed. Finally, we extend some of the results presented in the literature for classifier systems in non-Markovian problems, applying XCSM to environments that require the agent to perform sequences of actions on the internal memory. The results presented suggest that the exploration strategies currently employed in the study of XCS are too simple to be employed with XCSM; accordingly, other exploration strategies should be investigated in order to develop better classifier systems.
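
To make the distinction between reactive mappings and memory-based policies concrete, the following minimal sketch may help. It is an illustration only: the three-state corridor, the observation labels, and the hand-written policies are hypothetical and are not taken from the experiments reported here, and the sketch involves no learning at all. It assumes a single memory bit that each rule can read and overwrite alongside its external action, which is one simple way to picture an internal memory mechanism.

```python
from itertools import product

# Hypothetical toy environment (not from the paper): states 0 and 2 produce
# the same observation "a", but require different actions to advance.
OBS = {0: "a", 1: "b", 2: "a"}
CORRECT = {0: 0, 1: 0, 2: 1}   # action that moves the agent forward
GOAL = 3

def step(state, action):
    """Advance one state on the correct action; otherwise stay in place."""
    return state + 1 if action == CORRECT[state] else state

def run_reactive(policy, max_steps=10):
    """policy maps observation -> action; the agent has no memory."""
    state, steps = 0, 0
    while state != GOAL and steps < max_steps:
        state = step(state, policy[OBS[state]])
        steps += 1
    return steps if state == GOAL else None

def run_with_memory(policy, max_steps=10):
    """policy maps (observation, memory bit) -> (action, new memory bit)."""
    state, mem, steps = 0, 0, 0
    while state != GOAL and steps < max_steps:
        action, mem = policy[(OBS[state], mem)]
        state = step(state, action)
        steps += 1
    return steps if state == GOAL else None

# Hand-written memory policy: set the bit after seeing "b", so that the
# second "a" can be told apart from the first.
MEMORY_POLICY = {
    ("a", 0): (0, 0),
    ("b", 0): (0, 1),
    ("a", 1): (1, 1),
}

# Every deterministic reactive policy over two observations and two actions.
reactive_results = [
    run_reactive({"a": x, "b": y}) for x, y in product((0, 1), repeat=2)
]
memory_result = run_with_memory(MEMORY_POLICY)
```

In this toy setting none of the four deterministic reactive policies reaches the goal within the step limit, because any fixed response to "a" is wrong in one of the two aliased states, whereas the one-bit policy reaches it in three steps. How such rules are actually represented and evolved in XCSM is not shown by this sketch.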