A Unified Approach to Dynamic Decision Problems with Asymmetric Information - Part II: Strategic Agents

We study a general class of dynamic multi-agent decision problems with asymmetric information and nonstrategic agents, which includes dynamic teams as a special case. When agents are nonstrategic, an agent’s strategy is known to the other agents. Nevertheless, the agents’ strategy choices and beliefs are interdependent over time, a phenomenon known as signaling. We introduce the notion of sufficient information, which compresses the agents’ information in a mutually consistent manner. Based on this notion, we propose an information state for each agent that is sufficient for decision-making purposes. We present instances of dynamic multi-agent decision problems where we can determine an information state with a time-invariant domain for each agent. Furthermore, we present a generalization of the policy-independence property of belief in partially observed Markov decision processes (POMDPs) to dynamic multi-agent decision problems. Within the context of dynamic teams with asymmetric information, the proposed set of information states leads to a sequential decomposition that decouples the interdependence between the agents’ strategies and beliefs over time and enables us to formulate a dynamic program to determine a globally optimal policy via backward induction.
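As background for the policy-independence property mentioned above, the following is a minimal sketch of the classical single-agent POMDP belief update, not the paper's multi-agent generalization. The names `belief_update`, `T`, and `O` and the two-state numbers are illustrative assumptions that do not come from the paper. The point of the sketch is that the update uses only the realized action and observation, so the resulting belief does not depend on the policy that selected the action.

```python
def belief_update(belief, action, observation, T, O):
    """One Bayes-filter step for a finite POMDP (illustrative sketch).

    belief[s]      : prior probability of state s
    T[a][s][s2]    : P(next state s2 | state s, action a)
    O[a][s2][y]    : P(observation y | next state s2, action a)
    Only the realized action and observation enter the update;
    the policy that chose the action does not appear anywhere.
    """
    n = len(belief)
    # Predict the next-state distribution, then weight by the observation likelihood.
    unnormalized = [
        O[action][s2][observation]
        * sum(belief[s] * T[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under this belief and action")
    return [p / total for p in unnormalized]


# Hypothetical two-state, two-action model used only for illustration.
T = {
    0: [[0.9, 0.1], [0.2, 0.8]],
    1: [[0.5, 0.5], [0.5, 0.5]],
}
O = {
    0: [[0.8, 0.2], [0.3, 0.7]],
    1: [[0.8, 0.2], [0.3, 0.7]],
}

prior = [0.5, 0.5]
posterior = belief_update(prior, action=0, observation=0, T=T, O=O)
print(posterior)
```

In the multi-agent setting described in the abstract, beliefs generally do depend on the other agents' strategies through signaling, which is why a generalization of this property is needed.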
