Cooperative learning in multi-agent systems from intermittent measurements

Motivated by the problem of decentralized direction-tracking, we consider the general problem of cooperative learning in multi-agent systems with time-varying connectivity and intermittent measurements. We propose a distributed protocol that learns an unknown vector μ from noisy measurements made independently by autonomous nodes. The protocol is fully distributed and copes with time-varying, unpredictable, and noisy inter-agent communication as well as intermittent, noisy measurements of μ. Our main result bounds the learning speed of the protocol in terms of the size and combinatorial features of the (time-varying) network connecting the nodes.
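To make the setting concrete, the following is a minimal sketch of a generic averaging-plus-measurement scheme in this spirit. It is not the protocol of the paper, which the abstract does not specify. The function name, the random-matching communication model, the link and measurement probabilities, the Gaussian noise, and the step size t^(-0.7) are all assumptions chosen only for illustration.

```python
import random


def simulate_cooperative_learning(mu, n_nodes=8, steps=3000, p_link=0.5,
                                  p_measure=0.3, noise_std=0.5, seed=0):
    """Illustrative simulation (hypothetical, not the paper's protocol).

    Each node keeps an estimate of the unknown vector mu. At every step:
      1. nodes are randomly paired, and each pair averages its estimates
         only if its link happens to be up (time-varying connectivity);
      2. each node independently obtains a noisy measurement of mu with
         probability p_measure (intermittent measurements) and moves its
         estimate toward that measurement with a decaying step size.
    """
    rng = random.Random(seed)
    dim = len(mu)
    estimates = [[0.0] * dim for _ in range(n_nodes)]

    for t in range(1, steps + 1):
        # Communication: random matching with unreliable links (assumed model).
        order = list(range(n_nodes))
        rng.shuffle(order)
        for k in range(0, n_nodes - 1, 2):
            i, j = order[k], order[k + 1]
            if rng.random() < p_link:
                avg = [(a + b) / 2.0 for a, b in zip(estimates[i], estimates[j])]
                estimates[i] = list(avg)
                estimates[j] = list(avg)

        # Measurement: intermittent, noisy observations of mu.
        step = t ** -0.7  # assumed decaying step size
        for i in range(n_nodes):
            if rng.random() < p_measure:
                y = [m + rng.gauss(0.0, noise_std) for m in mu]
                estimates[i] = [x + step * (yi - x)
                                for x, yi in zip(estimates[i], y)]

    return estimates


if __name__ == "__main__":
    for node, est in enumerate(simulate_cooperative_learning([1.0, -2.0])):
        print(node, [round(v, 3) for v in est])
```

In this sketch the pairwise averaging spreads information between nodes even when a given node receives no measurement for a while, and the decaying step size damps the measurement noise over time. How fast such a scheme learns as a function of network size and structure is the kind of question the main result of the paper addresses for its own protocol.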
