Informational Design of Dynamic Multi-Agent System

This work considers a novel information design problem and studies how crafting payoff-relevant environmental signals alone can influence the behaviors of intelligent agents. The agents’ strategic interactions are captured by an incomplete-information Markov game, in which each agent first selects one environmental signal from multiple signal sources as additional payoff-relevant information and then takes an action. A rational information designer (principal) possesses one signal source and aims to control the agents’ equilibrium behaviors by designing the information structure of the signals she sends to the agents. We establish an obedient principle, which states that it is without loss of generality to focus on direct information design when the design incentivizes each agent to select the signal sent by the principal, so that the design process need not predict the agents’ strategic selection behaviors. Based on the obedient principle, we introduce a design protocol, given a goal of the principal, referred to as obedient implementability (OIL), and study a Myersonian information design that characterizes OIL in a class of obedient sequential Markov perfect Bayesian equilibria (O-SMPBE). We propose a framework based on an approach we refer to as fixed-point alignment, which incentivizes the agents to choose the signal sent by the principal, ensures that the agents’ action policy profile is the policy component of an O-SMPBE, and ensures that the principal’s goal is achieved. The proposed approach can be applied to elicit desired behaviors of multi-agent systems in competitive as well as cooperative settings and can be extended to heterogeneous stochastic games in complete- and incomplete-information environments.
