Robustness of stochastic stability in game theoretic learning

The notion of stochastic stability is used in game theoretic learning to characterize which joint actions of players exhibit high probabilities of occurrence in the long run. This paper examines the impact of two types of errors on stochastic stability: i) small unstructured uncertainty in the game parameters and ii) slow time variations of the game parameters. In the first case, we derive a continuity result that bounds the effects of small uncertainties. In the second case, we show that game play tracks drifting stochastically stable states under sufficiently slow time variations. The analysis is in terms of Markov chains and hence is applicable to a variety of game theoretic learning rules. Nonetheless, the approach is illustrated on the widely studied rule of log-linear learning. Finally, the results are applied in both simulation and laboratory experiments to distributed area coverage with mobile robots.
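As a rough illustration of the setting, the sketch below builds the Markov chain induced by log-linear learning on a small identical-interest game and compares its stationary distribution before and after a small payoff perturbation. Everything specific here is an assumption made for illustration and is not taken from the paper: the 2x2 potential values, the temperature, the perturbation size, and the function names `log_linear_chain`, `stationary`, and `total_variation`. Power iteration is simply a convenient way to get the stationary distribution, not the paper's method of analysis.

```python
import itertools
import math


def log_linear_chain(potential, n_actions, temperature):
    """Transition matrix of log-linear learning for an identical-interest game.

    `potential` maps each joint action (a tuple) to the common payoff.
    """
    states = list(itertools.product(*(range(n) for n in n_actions)))
    index = {s: k for k, s in enumerate(states)}
    n_players = len(n_actions)
    P = [[0.0] * len(states) for _ in states]
    for s in states:
        for i in range(n_players):
            # Player i is selected with probability 1/n_players and samples
            # a new action with probability proportional to exp(payoff / T).
            alts = [s[:i] + (a,) + s[i + 1:] for a in range(n_actions[i])]
            weights = [math.exp(potential[t] / temperature) for t in alts]
            total = sum(weights)
            for t, w in zip(alts, weights):
                P[index[s]][index[t]] += w / (n_players * total)
    return states, P


def stationary(P, iters=5000):
    """Approximate the stationary distribution by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[j] * P[j][k] for j in range(n)) for k in range(n)]
    return pi


def total_variation(p, q):
    """Total variation distance between two distributions on the same states."""
    return 0.5 * sum(abs(x - y) for x, y in zip(p, q))


# Hypothetical 2-player, 2-action coordination game (assumed values).
phi = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.8}
T = 0.5  # assumed temperature

states, P = log_linear_chain(phi, (2, 2), T)
pi = stationary(P)

# Gibbs distribution over the potential, for comparison.
z = sum(math.exp(phi[s] / T) for s in states)
gibbs = [math.exp(phi[s] / T) / z for s in states]

# Small perturbation of one payoff entry (assumed size).
phi_perturbed = dict(phi)
phi_perturbed[(1, 1)] += 0.01
_, P_perturbed = log_linear_chain(phi_perturbed, (2, 2), T)
pi_perturbed = stationary(P_perturbed)

tv_shift = total_variation(pi, pi_perturbed)
print(dict(zip(states, pi)))
print("TV distance after perturbation:", tv_shift)
```

In this toy case the computed stationary distribution should agree with the Gibbs distribution over the potential, which is the standard result for log-linear learning in potential games, and the small payoff change should move it only slightly. That is the kind of continuity the abstract refers to, shown here on one example rather than proved.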
