Control of large distributed systems using games with pure strategy Nash equilibria

Control mechanisms for optimisation in large distributed systems cannot be built using traditional methods of control, because such systems are typically characterised by distributed information and costly and/or noisy communication. Noisy observations and dynamism are also inherent to these systems, so their control mechanisms need to be flexible, agile and robust in the face of these characteristics. In such settings, a good control mechanism should satisfy the following four design requirements: (i) it should produce high-quality solutions, (ii) it should be robust and flexible in the face of additions, removals and failures of components, (iii) it should make limited use of communication, and (iv) its operation should be computationally feasible. To satisfy these requirements, in this thesis we adopt a design approach based on dividing control over the system across a team of self-interested agents. Such multi-agent systems (MAS) are naturally distributed (matching the application domains in question), and by pursuing their own private goals, the agents can collectively implement robust, flexible and scalable control mechanisms. In more detail, the design approach we adopt is (i) to use games with pure strategy Nash equilibria as a framework or template for constructing the agents' utility functions, such that good solutions to the optimisation problem arise at the pure strategy Nash equilibria of the game, and (ii) to derive distributed techniques for solving the games for their Nash equilibria.

The specific problems we tackle can be grouped into four main topics. First, we investigate a class of local algorithms for distributed constraint optimisation problems (DCOPs). We introduce a unifying analytical framework for studying such algorithms, and develop a parameterisation of the algorithm design space, which represents a mapping from the algorithms' components to their performance according to each of our design requirements. Second, we develop a game-theoretic control mechanism for distributed dynamic task allocation and scheduling problems. The model in question is an extension of DCOPs to encompass dynamic problems, and the control mechanism we derive builds on the insights from our first topic to address our four design requirements. Third, we elaborate a general class of problems including DCOPs with noisy rewards and state observations, which are realistic traits of great concern in real-world problems, and derive control mechanisms for these environments. These control mechanisms allow the agents either to learn their reward functions or to decide when to make observations of the world's state and/or communicate their beliefs over the state of the world, in such a manner that they perform well according to our design requirements. Fourth, we derive an optimal algorithm for computing and optimising over pure strategy Nash equilibria in games with sparse interaction structure. By exploiting the structure present in many multi-agent interactions, this distributed algorithm can efficiently compute equilibria that optimise various criteria, thus reducing the computational burden on any one agent and using less communication than an equivalent centralised algorithm. For each of these topics, the control mechanisms that we derive are developed such that they perform well according to all four of our design requirements. In sum, by making the above contributions to these specific topics, we demonstrate that the general approach of using games with pure strategy Nash equilibria as a template for designing MAS produces good control mechanisms for large distributed systems.
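As a rough illustration of the design approach summarised above, the following Python sketch shows how a small DCOP can be cast as a game whose pure strategy Nash equilibria correspond to locally optimal solutions. It is not one of the thesis's own algorithms. It assumes a hypothetical toy problem (four agents on a ring, two colours, a reward of 1 for each constraint whose endpoints differ) and uses plain sequential better-response dynamics. All names (`EDGES`, `utility`, `potential`, `better_response_dynamics`, `is_pure_nash`) are invented for the example.

```python
# Hypothetical toy DCOP: 4 agents on a ring, each choosing one of 2 colours.
# A constraint (edge) pays 1 when its two endpoints differ, otherwise 0.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]
COLOURS = [0, 1]
N_AGENTS = 4


def edge_reward(a, b):
    """Reward of a single binary constraint."""
    return 1 if a != b else 0


def utility(agent, profile):
    """Private utility: sum of rewards on the constraints the agent is part of."""
    return sum(edge_reward(profile[i], profile[j])
               for i, j in EDGES if agent in (i, j))


def potential(profile):
    """Global objective: every constraint counted exactly once."""
    return sum(edge_reward(profile[i], profile[j]) for i, j in EDGES)


def deviate(profile, agent, colour):
    """Return a copy of the profile with one agent's choice changed."""
    trial = list(profile)
    trial[agent] = colour
    return trial


def is_pure_nash(profile):
    """True if no agent can strictly gain by changing its own colour."""
    return all(utility(a, deviate(profile, a, c)) <= utility(a, profile)
               for a in range(N_AGENTS) for c in COLOURS)


def better_response_dynamics(profile, max_rounds=100):
    """Agents take turns switching to any strictly better colour."""
    profile = list(profile)
    for _ in range(max_rounds):
        changed = False
        for agent in range(N_AGENTS):
            for colour in COLOURS:
                trial = deviate(profile, agent, colour)
                if utility(agent, trial) > utility(agent, profile):
                    profile, changed = trial, True
        if not changed:  # nobody wants to move: a pure Nash equilibrium
            break
    return profile


if __name__ == "__main__":
    result = better_response_dynamics([0, 0, 0, 0])
    print(result, potential(result), is_pure_nash(result))
```

In this sketch, any unilateral change in an agent's utility equals the change in the global objective, so the global objective serves as a potential function. Each strict improvement therefore raises it, and the dynamics stop at a pure strategy Nash equilibrium after finitely many moves. Each agent only needs the current choices of its neighbours, which is the kind of limited communication the design requirements call for.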
