Stochastic Adaptive Nash Certainty Equivalence Control: Self-Identification Case

For noncooperative games, the Nash Certainty Equivalence (NCE), or Mean Field (MF), methodology developed in previous work provides decentralized strategies which asymptotically yield Nash equilibria. The NCE (MF) control laws use only the local information of each agent on its own state evolution and knowledge of its own dynamical parameters, while the behaviour of the mass is precomputable from knowledge of the distribution of dynamical parameters throughout the mass population. Relaxing the a priori information condition introduces the methods of parameter estimation and stochastic adaptive control (SAC) into MF control theory. In particular, one may consider incrementally the problems where each agent must estimate: (i) its own dynamical parameters, (ii) the distribution of the population's dynamical parameters (1), and (iii) the distribution of the population's cost function parameters (2). In this paper we treat the first problem. Each agent estimates its own dynamical parameters via the recursive weighted least squares (RWLS) algorithm. Under reasonable conditions on the population dynamical parameter distribution, we establish that: (i) the self-parameter estimates are strongly consistent; (ii) all agent systems are long run average $L^2$ stable; (iii) the set of controls yields a (strong) $\varepsilon$-Nash equilibrium for all $\varepsilon$; and (iv) in the population limit the long run average cost obtained is equal to the non-adaptive long run average cost.
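
To illustrate the self-identification step, the following is a minimal sketch of a weighted recursive least-squares update, not the algorithm as specified in the paper. It rests on several assumptions made only for illustration: a discrete-time scalar model x_{k+1} = a x_k + b u_k + w_k, a slowly decaying weight of the form 1 / log(e + r_k)^{1 + delta}, and a random exciting input in place of the NCE control law. The names `rwls_step` and `simulate` are hypothetical.

```python
import math
import random


def rwls_step(theta, P, phi, y, r, delta=0.5):
    """One weighted recursive least-squares update for a 2-parameter model.

    theta: current estimate [a_hat, b_hat]
    P:     2x2 symmetric matrix (list of lists)
    phi:   regressor [x_k, u_k]
    y:     observed next state x_{k+1}
    r:     running sum of squared regressor norms
    """
    r = r + phi[0] ** 2 + phi[1] ** 2
    # Assumed weight: decays slowly as the accumulated regressor energy grows.
    alpha = 1.0 / math.log(math.e + r) ** (1.0 + delta)
    # P * phi
    p_phi = [P[0][0] * phi[0] + P[0][1] * phi[1],
             P[1][0] * phi[0] + P[1][1] * phi[1]]
    denom = 1.0 / alpha + phi[0] * p_phi[0] + phi[1] * p_phi[1]
    # Prediction error under the current estimate.
    err = y - (theta[0] * phi[0] + theta[1] * phi[1])
    theta = [theta[0] + p_phi[0] * err / denom,
             theta[1] + p_phi[1] * err / denom]
    # Rank-one downdate of P (matrix inversion lemma; P is symmetric).
    P = [[P[i][j] - p_phi[i] * p_phi[j] / denom for j in range(2)]
         for i in range(2)]
    return theta, P, r


def simulate(a=0.8, b=0.5, steps=5000, noise=0.1, seed=0):
    """Run a single agent that estimates its own (a, b) from its state path."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    P = [[100.0, 0.0], [0.0, 100.0]]
    r = 0.0
    x = 0.0
    for _ in range(steps):
        u = rng.gauss(0.0, 1.0)  # exciting input: an assumption of this sketch
        x_next = a * x + b * u + noise * rng.gauss(0.0, 1.0)
        theta, P, r = rwls_step(theta, P, [x, u], x_next, r)
        x = x_next
    return theta


if __name__ == "__main__":
    a_hat, b_hat = simulate()
    print("estimated a, b:", a_hat, b_hat)
```

In this sketch each agent uses only its own state and input history, which mirrors the decentralized information structure described above; the consistency and stability results stated in the abstract concern the paper's own setting and are not demonstrated by this toy simulation.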