Social Agents Playing a Periodical Policy

Coordination is an important issue in multiagent systems. Within the stochastic game framework this problem translates to policy learning in a joint action space. This technique, however, suffers from important drawbacks, such as the assumptions of a unique Nash equilibrium and of synchronicity, the need for central control, and the cost of communication. Moreover, in general-sum games it is not always clear which policies should be learned. Playing pure Nash equilibria is often unfair to at least one of the players, while playing a mixed strategy gives no guarantee of coordination and usually results in a suboptimal payoff for all agents. In this work we show the usefulness of periodical policies, which arise as a side effect of the fairness conditions used by the agents. We are interested in games that involve competition between the players, but where the overall performance can only be as good as that of the poorest-performing player. The players are social, distributed reinforcement learners who must learn to equalize their payoffs. Our approach is illustrated on both synchronous one-step games and asynchronous job scheduling games.
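To make the core idea concrete, the following is a minimal sketch, not the paper's method: it assumes a hypothetical Battle-of-the-Sexes-style payoff matrix (the names PAYOFF and average_payoffs are illustrative) in which each of the two pure Nash equilibria favours a different player. Alternating between the two equilibria, i.e., a periodical policy, equalizes the players' average payoffs, whereas the mixed Nash equilibrium coordinates poorly and pays both players less.

```python
import random

# Hypothetical joint payoffs: PAYOFF[(a1, a2)] = (payoff player 1, payoff player 2)
PAYOFF = {
    ("A", "A"): (2, 1),   # pure Nash equilibrium favouring player 1
    ("B", "B"): (1, 2),   # pure Nash equilibrium favouring player 2
    ("A", "B"): (0, 0),   # miscoordination
    ("B", "A"): (0, 0),
}

def average_payoffs(policy, rounds=10_000):
    """Average per-round payoffs when play follows `policy`,
    a function mapping the round index to a joint action."""
    totals = [0.0, 0.0]
    for t in range(rounds):
        r1, r2 = PAYOFF[policy(t)]
        totals[0] += r1
        totals[1] += r2
    return totals[0] / rounds, totals[1] / rounds

# Periodical policy: alternate between the two pure equilibria.
periodical = lambda t: ("A", "A") if t % 2 == 0 else ("B", "B")

# Mixed Nash equilibrium of this game: player 1 plays A with
# probability 2/3, player 2 with probability 1/3, independently.
def mixed(t):
    a1 = "A" if random.random() < 2 / 3 else "B"
    a2 = "A" if random.random() < 1 / 3 else "B"
    return a1, a2

print("periodical:", average_payoffs(periodical))  # ~ (1.5, 1.5): equal and fair
print("mixed NE:  ", average_payoffs(mixed))       # ~ (0.67, 0.67): lower for both
```

Under these assumed payoffs the periodical policy dominates the mixed equilibrium for both players while satisfying the fairness condition that neither player is systematically worse off.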