A Model of Partially Observable State Game and its Optimality

In this paper we present a model of a two-player partially observable “state game” and study its optimality. The model is inspired by the practical problem of negotiation in a multi-agent system and formulates, from a game-theoretic point of view, the so-called contract net protocol. It covers a wide variety of real problems, including simple card games such as blackjack and many negotiation and bargaining situations. The results that follow are valid for non-zero-sum games as well as for zero-sum games. Essentially, we establish and prove the relation between partially observable state games and certain classical (single-state) bi-matrix games. If the original state game is zero-sum, then the equivalent bi-matrix game is zero-sum as well.
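To make the bi-matrix setting concrete, here is a minimal sketch (not taken from the paper; all names and the example game are illustrative assumptions): a bi-matrix game represented as a pair of payoff matrices, with expected payoffs computed under mixed strategies, and the zero-sum property checked entrywise.

```python
# Illustrative sketch: a bi-matrix game is a pair (A, B) of payoff
# matrices, one per player. The example game below is Matching Pennies.

def expected_payoffs(A, B, x, y):
    """Expected payoffs (row player, column player) when the row player
    mixes with distribution x and the column player with distribution y."""
    u_row = sum(x[i] * y[j] * A[i][j]
                for i in range(len(A)) for j in range(len(A[0])))
    u_col = sum(x[i] * y[j] * B[i][j]
                for i in range(len(B)) for j in range(len(B[0])))
    return u_row, u_col

def is_zero_sum(A, B):
    """A bi-matrix game is zero-sum iff the two payoffs cancel entrywise."""
    return all(A[i][j] + B[i][j] == 0
               for i in range(len(A)) for j in range(len(A[0])))

# Matching Pennies: a zero-sum bi-matrix game, so B = -A entrywise.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
print(is_zero_sum(A, B))                               # True
print(expected_payoffs(A, B, [0.5, 0.5], [0.5, 0.5]))  # (0.0, 0.0)
```

Under the uniform mixed strategies the payoffs cancel, consistent with the value of Matching Pennies being zero; the same entrywise-cancellation test is what "zero-sum is preserved" means for the equivalent bi-matrix game.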
