Adaptive Multi-agent Programming in GTGolog

We present a novel approach to adaptive multi-agent programming, based on an integration of the agent programming language GTGolog with adaptive dynamic programming techniques. GTGolog combines explicit agent programming in Golog with multi-agent planning in stochastic games. A drawback of this framework, however, is that the transition probabilities and reward values of the domain must be known in advance and cannot change thereafter. Such data is often unavailable in advance and may also change over time. The adaptive generalization of GTGolog presented in this paper lets the agents themselves explore and adapt these data, which makes the framework more useful for realistic applications. We use high-level programs to generate both abstract states and optimal policies, exploiting the deep integration between the action theory and high-level programs in the Golog framework.
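To make the "adaptive" part concrete: the classic adaptive dynamic programming technique alluded to here is model-free temporal-difference learning, where an agent estimates optimal values purely from interaction, without knowing the transition probabilities or rewards in advance. The following is a minimal illustrative sketch of tabular Q-learning on a made-up two-state toy domain; it is not the GTGolog framework itself (which operates over stochastic games and high-level Golog programs), only the underlying learning principle.

```python
import random

def step(state, action, rng):
    """Hidden toy dynamics: the agent must learn these by interaction.
    From state 0, action 1 is risky but usually pays off; all other
    state/action pairs loop back to state 0 with zero reward."""
    if state == 0 and action == 1:
        return (1, 10.0) if rng.random() < 0.8 else (0, -1.0)
    return (0, 0.0)

def q_learn(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy exploration policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
    for _ in range(episodes):
        state = 0
        for _ in range(10):  # bounded episode length
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice((0, 1))
            else:
                action = max((0, 1), key=lambda a: q[(state, a)])
            nxt, reward = step(state, action, rng)
            # Temporal-difference update toward the observed sample.
            best_next = max(q[(nxt, a)] for a in (0, 1))
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = nxt
    return q

q = q_learn()
# After learning, the risky action should look better in state 0.
assert q[(0, 1)] > q[(0, 0)]
```

The paper's contribution can be read as lifting this kind of convergence-by-interaction from flat state spaces to the abstract states induced by GTGolog programs, so that learned values attach to program-level decision points rather than to raw domain states.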
