Paper information - Cooperative learning in multi-agent systems from intermittent measurements

Cooperative learning in multi-agent systems from intermittent measurements

Motivated by the problem of decentralized direction-tracking, we consider the general problem of cooperative learning in multi-agent systems with time-varying connectivity and intermittent measurements. We propose a distributed learning protocol capable of learning an unknown vector μ from noisy measurements made independently by autonomous nodes. Our protocol is completely distributed and able to cope with the time-varying, unpredictable, and noisy nature of inter-agent communication, as well as with intermittent, noisy measurements of μ. Our main result bounds the learning speed of our protocol in terms of the size and combinatorial features of the (time-varying) network connecting the nodes.

Naomi Ehrich Leonard | Alexander Olshevsky
