The Evolution of Strategies for Multiagent Environments

SAMUEL is an experimental learning system that uses genetic algorithms and other learning methods to evolve reactive decision rules from simulations of multiagent environments. The basic approach is to explore a range of behaviors within a simulation model, using feedback to adapt the system's decision strategies over time. One of the main themes of this research is that the learning system should be able to take advantage of existing knowledge where available. This has led to the adoption of rule representations that make it easy to express existing knowledge. A second theme is that adaptation can be driven by competition among knowledge structures. Competition is applied at two levels in SAMUEL. Within a strategy composed of decision rules, rules compete with one another to influence the behavior of the system. At a higher level of granularity, entire strategies compete with one another, driven by a genetic algorithm. This article focuses on recent elaborations of the agent model of SAMUEL that are specifically designed to respond to multiple external agents. Experimental results are presented that illustrate the behavior of SAMUEL on two multiagent predator-prey tasks.
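The two-level competition described above can be illustrated with a minimal sketch: rules within a strategy compete to fire and receive strength adjustments based on their immediate effect, while whole strategies compete under a genetic algorithm on a toy one-dimensional pursuit task. All names, parameter values, and the rule/credit scheme here are illustrative assumptions, not the actual SAMUEL representation or algorithms.

```python
import random

random.seed(0)

ACTIONS = [-1, 0, 1]  # move left, stay, move right on a 1-D track

def make_rule():
    # rule = [threshold, action, strength]; it matches when the prey's
    # offset from the predator is at least `threshold` (a toy condition)
    return [random.randint(-5, 5), random.choice(ACTIONS), 1.0]

def make_strategy(n_rules=5):
    return [make_rule() for _ in range(n_rules)]

def evaluate(strategy, steps=20):
    # Toy pursuit episode: a predator chases a prey that drifts right.
    # Fitness is the negated final distance (higher is better).
    pred, prey = 0, 3
    for _ in range(steps):
        offset = prey - pred
        matching = [r for r in strategy if offset >= r[0]]
        if matching:
            # rule-level competition: the strongest matching rule fires
            rule = max(matching, key=lambda r: r[2])
            before = abs(prey - pred)
            pred += rule[1]
            after = abs(prey - pred)
            # credit assignment: reinforce the rule if the gap shrank
            rule[2] += 0.1 if after < before else -0.05
        prey += random.choice([0, 1])  # prey drifts right stochastically
    return -abs(prey - pred)

def crossover(a, b):
    cut = random.randint(1, len(a) - 1)
    return [r[:] for r in a[:cut]] + [r[:] for r in b[cut:]]

def mutate(strategy, rate=0.2):
    for r in strategy:
        if random.random() < rate:
            r[0] = random.randint(-5, 5)
            r[1] = random.choice(ACTIONS)
    return strategy

# strategy-level competition: a generational GA over whole strategies
pop = [make_strategy() for _ in range(30)]
for gen in range(40):
    scored = sorted(pop, key=evaluate, reverse=True)
    elite = scored[:10]
    pop = elite + [mutate(crossover(random.choice(elite), random.choice(elite)))
                   for _ in range(20)]

best = max(pop, key=evaluate)
```

The key design point mirrored here is the separation of granularities: strengths adapt individual rules within a single episode, while selection, crossover, and mutation operate only on complete strategies across generations.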
