Improving MACS Thanks to a Comparison with 2TBNs

Factored Markov Decision Processes are the theoretical framework underlying multi-step Learning Classifier Systems research. This framework is mostly used in the context of Two-stage Bayes Networks (2TBNs), a subset of Bayes Networks. In this paper, we compare the Learning Classifier Systems approach and the Bayes Networks approach to factored Markov Decision Problems. More specifically, we compare MACS, an Anticipatory Learning Classifier System, with Structured Policy Iteration, a general planning algorithm used in the context of Two-stage Bayes Networks. From that comparison, we define a new algorithm obtained by adapting Structured Policy Iteration to the context of MACS. We conclude by calling for closer communication between the two research communities.
