Reconciling Rationality and Stochasticity: Rich Behavioral Models in Two-Player Games

Two traditional paradigms are often used to describe the behavior of agents in complex multi-agent systems. In the first, agents are considered fully rational and the system is seen as a multi-player game. In the second, agents are considered fully stochastic processes and the system itself is seen as one large stochastic process. From the standpoint of a particular agent that has to choose a strategy, the choice of paradigm is crucial: the most adequate strategy depends on the assumptions made about the other agents. In this paper, we focus on two-player games and their application to the automated synthesis of reliable controllers for reactive systems, a field at the crossroads of computer science and mathematics. In this setting, the reactive system to control is one player and its environment is the opponent, usually assumed to be either fully antagonistic or fully stochastic. We illustrate several recent developments that aim to move beyond this narrow taxonomy by providing formal concepts and mathematical frameworks for reasoning about richer behavioral models. The interest of such models is not limited to reactive system synthesis but extends to other application fields of game theory. Our goal is to give a high-level presentation of the key concepts and applications, aimed at a broad audience. To this end, we illustrate these rich behavioral models on a classical challenge of everyday life: planning a journey in an uncertain environment.
