Opportunistic Synthesis in Reactive Games under Information Asymmetry

Reactive synthesis is a class of methods for constructing a provably correct control system, referred to here as a robot, with respect to a temporal logic specification in the presence of a dynamic and uncontrollable environment. This is achieved by modeling the interaction between the robot and its environment as a two-player zero-sum game. However, existing reactive synthesis methods assume that both players have complete and symmetric information, which is not the case in many strategic interactions. In this paper, we use a variant of hypergames to model the interaction between the robot and its environment, where the latter has incomplete information about the robot's specification. We propose a novel method of opportunistic synthesis, defined over the hypergame model, that identifies a subset of hypergame states from which the robot can leverage the asymmetric information to achieve a better outcome than is possible when both players have symmetric and complete information. Assuming that the environment plays a stochastic strategy in its perceived sure-winning and sure-losing regions of the game, we show that by following the opportunistic strategy, the robot is guaranteed to only improve the outcome of the game, measured by the satisfaction of sub-specifications, whenever an opportunity becomes available. We demonstrate the correctness and optimality of this method using a robot motion planning example in the presence of an adversary.
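To make the notion of a sure-winning region concrete, the following is a minimal sketch of the standard attractor computation for a turn-based reachability game on a finite graph. It is not the hypergame-based opportunistic synthesis proposed in the paper; the function name `sure_winning_region`, the edge-list encoding, and the five-state example are assumptions introduced purely for illustration.

```python
from collections import deque


def sure_winning_region(states, robot_states, edges, targets):
    """Return the states from which the robot can force a visit to `targets`.

    Hypothetical encoding: `edges` is an iterable of (u, v) pairs; states in
    `robot_states` are controlled by the robot, all others by the environment.
    """
    succ = {s: set() for s in states}
    pred = {s: set() for s in states}
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)

    win = set(targets)
    # For each state, the number of successors not yet known to be winning.
    remaining = {s: len(succ[s]) for s in states}
    queue = deque(win)
    while queue:
        v = queue.popleft()
        for u in pred[v]:
            if u in win:
                continue
            if u in robot_states:
                # The robot needs only one move into the winning set.
                win.add(u)
                queue.append(u)
            else:
                # The environment loses only if every move enters the set.
                remaining[u] -= 1
                if remaining[u] == 0:
                    win.add(u)
                    queue.append(u)
    return win


# Illustrative game: the robot moves at states 0 and 2, the environment at
# states 1 and 3, and state 4 is the robot's target.
states = {0, 1, 2, 3, 4}
robot_states = {0, 2}
edges = [(0, 1), (0, 3), (1, 2), (1, 4), (2, 4), (2, 3), (3, 3), (3, 0), (4, 4)]
region = sure_winning_region(states, robot_states, edges, targets={4})
```

In this example, state 3 lies outside the computed region because the environment can remain there indefinitely through the self-loop. Under symmetric and complete information such a state is losing for the robot; the opportunistic setting described above concerns states where an environment with incomplete information about the robot's specification may not play this way.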
