Adversarial Reinforcement Learning

Reinforcement Learning has been used for a number of years in single-agent environments. This article reports on our investigation of Reinforcement Learning techniques in a multi-agent, adversarial environment with continuous observable state information. We introduce a new framework, two-player hexagonal grid soccer, in which to evaluate algorithms. We then compare the performance of several single-agent Reinforcement Learning techniques in that environment, and further compare them to a previously developed adversarial Reinforcement Learning algorithm designed for Markov games. Building upon these efforts, we introduce new algorithms to handle the multi-agent, adversarial, and continuous-valued aspects of the domain: a technique for modelling the opponent in an adversarial game, an extension to Prioritized Sweeping that allows generalization of learnt knowledge over neighboring states in the domain, and an extension to the U Tree generalizing algorithm that allows the handling of continuous state spaces. Extensive empirical evaluation is conducted in the grid soccer domain.
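To make the evaluation setting concrete, the sketch below (in Python, our own illustration rather than the authors' framework; the class name GridSoccer, the axial move set, the board size, and the reward values are all assumptions) shows a minimal two-player grid-soccer interface of the kind against which single-agent and adversarial Reinforcement Learning algorithms could be compared: both players act simultaneously, the joint state is observable, and the game is zero-sum.

    import random

    class GridSoccer:
        """Minimal sketch: two players on a hexagonal grid; the ball
        carrier scores by reaching the opponent's end of the board.
        Both players choose their moves simultaneously."""

        # Axial-coordinate offsets for the six hexagonal moves plus "stay".
        MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

        def __init__(self, width=5, height=5):
            self.width, self.height = width, height
            self.reset()

        def reset(self):
            # Players start at opposite ends; the ball is given to one at random.
            self.pos = {0: (0, self.height // 2),
                        1: (self.width - 1, self.height // 2)}
            self.ball = random.choice([0, 1])
            return self.observe()

        def observe(self):
            # Joint state visible to both players.
            return (self.pos[0], self.pos[1], self.ball)

        def step(self, a0, a1):
            """Apply simultaneous moves; return (state, reward for player 0, done)."""
            for player, action in ((0, a0), (1, a1)):
                dq, dr = self.MOVES[action]
                q, r = self.pos[player]
                nq, nr = q + dq, r + dr
                if 0 <= nq < self.width and 0 <= nr < self.height:
                    self.pos[player] = (nq, nr)
            # Collision: the defender steals the ball.
            if self.pos[0] == self.pos[1]:
                self.ball = 1 - self.ball
            # Zero-sum reward: +1 if player 0 scores, -1 if player 1 scores.
            carrier_col = self.pos[self.ball][0]
            if self.ball == 0 and carrier_col == self.width - 1:
                return self.observe(), +1.0, True
            if self.ball == 1 and carrier_col == 0:
                return self.observe(), -1.0, True
            return self.observe(), 0.0, False

Such an interface exposes exactly the structure the abstract emphasizes: a Markov game with simultaneous moves and a single zero-sum reward, which is what distinguishes the adversarial algorithms studied here from standard single-agent learners.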