Mean Field LQG Control in Leader-Follower Stochastic Multi-Agent Systems: Likelihood Ratio Based Adaptation

We study large-population leader-follower stochastic multi-agent systems in which the agents have linear stochastic dynamics and are coupled through their quadratic cost functions. The cost of each leader reflects a trade-off between moving toward a reference trajectory that is unknown to the followers and staying near the leaders' own centroid. The followers, in turn, react by tracking a convex combination of their own centroid and the centroid of the leaders. We approach this large-population dynamic game problem using Mean Field (MF) linear-quadratic-Gaussian (LQG) stochastic control theory. In this model, followers are adaptive in the sense that they use a likelihood ratio estimator (on a sample population of the leaders' trajectories) to identify which member of a given finite class of models is generating the reference trajectory of the leaders. Under appropriate conditions, it is shown that the true reference trajectory model is identified by each follower in finite time with probability one as the leaders' population goes to infinity. Furthermore, we show that the resulting sets of mean field control laws for both leaders and adaptive followers possess an almost sure ε_N-Nash equilibrium property for a system with population N, where ε_N goes to zero as N goes to infinity. Numerical experiments illustrate the results.
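
To make the identification step concrete, the following is a minimal sketch of choosing among a finite class of candidate reference trajectories by comparing Gaussian log-likelihoods of the leaders' sample mean. It is a discrete-time simplification written for illustration, not the estimator analysed in the paper: the candidate trajectories, the noise model, and all names (`CANDIDATES`, `simulate_leader_mean`, `identify_model`) are assumptions introduced here.

```python
import math
import random

# Hypothetical finite class of candidate reference trajectories (assumed for illustration).
CANDIDATES = [
    lambda t: 0.0,                # model 0: constant reference
    lambda t: math.sin(0.1 * t),  # model 1: sinusoidal reference
    lambda t: 0.05 * t,           # model 2: linear ramp
]


def simulate_leader_mean(true_model, n_leaders, n_steps, sigma, rng):
    """Sample mean of n_leaders noisy observations of the reference at each step.

    Assumption: each leader's observed state is the reference plus
    independent Gaussian noise with standard deviation sigma.
    """
    observations = []
    for t in range(n_steps):
        total = sum(true_model(t) + sigma * rng.gauss(0.0, 1.0)
                    for _ in range(n_leaders))
        observations.append(total / n_leaders)
    return observations


def log_likelihood(observations, candidate, noise_var):
    """Gaussian log-likelihood of the observations, up to an additive constant."""
    squared_error = sum((y - candidate(t)) ** 2
                        for t, y in enumerate(observations))
    return -squared_error / (2.0 * noise_var)


def identify_model(observations, candidates, noise_var):
    """Return the index of the candidate with the largest log-likelihood.

    Picking the maximiser is equivalent to comparing pairwise likelihood ratios.
    """
    scores = [log_likelihood(observations, h, noise_var) for h in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


rng = random.Random(0)
n_leaders = 200
obs = simulate_leader_mean(CANDIDATES[1], n_leaders, n_steps=50, sigma=1.0, rng=rng)
# The variance of the sample mean shrinks like sigma^2 / n_leaders.
estimate = identify_model(obs, CANDIDATES, noise_var=1.0 / n_leaders)
```

In this toy setting the noise in the sample mean shrinks as the number of sampled leaders grows, so the candidates separate more sharply. This loosely mirrors the role that a large leader population plays in the identification result stated above.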

[1]  J. Cruz,et al.  On the Stackelberg strategy in nonzero-sum games , 1973 .

[2]  Daron Acemoglu,et al.  Robust comparative statics in large static games , 2010, 49th IEEE Conference on Decision and Control (CDC).

[3]  D. Delchamps Analytic feedback control and the algebraic Riccati equation , 1984 .

[4]  P. Lions,et al.  Mean field games , 2007 .

[5]  Peter E. Caines,et al.  Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle , 2006, Commun. Inf. Syst..

[6]  P. Lions,et al.  Jeux à champ moyen. II – Horizon fini et contrôle optimal , 2006 .

[7]  I. Couzin,et al.  Effective leadership and decision-making in animal groups on the move , 2005, Nature.

[8]  P. E. Caines,et al.  A Note on the Consistency of Maximum Likelihood Estimates for Finite Families of Stochastic Processes , 1975 .

[9]  E.M. Atkins,et al.  A survey of consensus problems in multi-agent coordination , 2005, Proceedings of the 2005, American Control Conference, 2005..

[10]  P. Markowich,et al.  Boltzmann and Fokker–Planck equations modelling opinion formation in the presence of strong leaders , 2009, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences.

[11]  P. Caines,et al.  Individual and mass behaviour in large population stochastic wireless power control problems: centralized and Nash equilibrium solutions , 2003, 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475).

[12]  Mireille E. Broucke,et al.  Local control strategies for groups of mobile autonomous agents , 2004, IEEE Transactions on Automatic Control.

[13]  Kai Lai Chung,et al.  A Course in Probability Theory , 1949 .

[14]  Sean P. Meyn,et al.  Synchronization of Coupled Oscillators is a Game , 2010, IEEE Transactions on Automatic Control.

[15]  Pedro Elosegui,et al.  Extension of the Cucker-Smale Control Law to Space Flight Formations , 2009 .

[16]  Minyi Huang,et al.  Large-Population Cost-Coupled LQG Problems With Nonuniform Agents: Individual-Mass Behavior and Decentralized $\varepsilon$-Nash Equilibria , 2007, IEEE Transactions on Automatic Control.

[17]  P. Graefe Linear stochastic systems , 1966 .

[18]  Peter E. Caines,et al.  Mean Field (NCE) Formulation of Estimation Based Leader-Follower Collective Dynamics , 2011, Int. J. Robotics Autom..

[19]  Jean-Jacques E. Slotine,et al.  A theoretical study of different leader roles in networks , 2006, IEEE Transactions on Automatic Control.

[20]  D. Helbing,et al.  Leadership, consensus decision making and collective behaviour in humans , 2009, Philosophical Transactions of the Royal Society B: Biological Sciences.

[21]  Tao Li,et al.  Asymptotically Optimal Decentralized Control for Large Population Stochastic Multiagent Systems , 2008, IEEE Transactions on Automatic Control.

[22]  Benjamin Van Roy,et al.  Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games , 2005, NIPS.

[23]  Graham C. Goodwin,et al.  Adaptive filtering prediction and control , 1984 .

[24]  Richard M. Murray,et al.  Information flow and cooperative control of vehicle formations , 2004, IEEE Transactions on Automatic Control.

[25]  P. Lions,et al.  Jeux à champ moyen. I – Le cas stationnaire , 2006 .

[26]  Wei Ren,et al.  Multi-vehicle consensus with a time-varying reference state , 2007, Syst. Control. Lett..

[27]  Jinjun Shan,et al.  Adaptive Synchronization Control of Multiple Spacecraft Formation Flying , 2007 .

[28]  Peter E. Caines,et al.  An Invariance Principle in Large Population Stochastic Dynamic Games , 2007, J. Syst. Sci. Complex..

[29]  Peter E. Caines,et al.  Optimality of adaption based Mean Field control laws in leader-follower stochastic collective dynamics , 2010, 49th IEEE Conference on Decision and Control (CDC).

[30]  T. Vicsek,et al.  Hierarchical group dynamics in pigeon flocks , 2010, Nature.

[31]  Dongbing Gu,et al.  Leader–Follower Flocking: Algorithms and Experiments , 2009, IEEE Transactions on Control Systems Technology.

[32]  Robert Shield,et al.  Modeling the Effect of Leadership on Crowd Flow Dynamics , 2004, ACRI.

[33]  Val E. Lambson Self-enforcing collusion in large dynamic markets , 1984 .

[34]  Fernando Paganini,et al.  IEEE Transactions on Automatic Control , 2006 .

[35]  Tyrone E. Duncan,et al.  Likelihood Functions for Stochastic Signals in White Noise , 1970, Inf. Control..

[36]  Tyrone E. Duncan,et al.  Evaluation of Likelihood Functions , 1968, Inf. Control..