QD-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations
[1] Milind Tambe,et al. The Communicative Multiagent Team Decision Problem: Analyzing Teamwork Theories and Models , 2011, J. Artif. Intell. Res.
[2] M.G. Rabbat,et al. Generalized consensus computation in networked systems with erasure links , 2005, IEEE 6th Workshop on Signal Processing Advances in Wireless Communications, 2005.
[3] A. Shiryaev,et al. Limit Theorems for Stochastic Processes , 1987 .
[4] Richard S. Sutton,et al. Reinforcement Learning is Direct Adaptive Optimal Control , 1992, 1991 American Control Conference.
[5] Ali H. Sayed,et al. Diffusion Least-Mean Squares Over Adaptive Networks: Formulation and Performance Analysis , 2008, IEEE Transactions on Signal Processing.
[6] Yishay Mansour,et al. Learning Rates for Q-learning , 2004, J. Mach. Learn. Res.
[7] Reza Olfati-Saber,et al. Consensus and Cooperation in Networked Multi-Agent Systems , 2007, Proceedings of the IEEE.
[8] Shin'ichi Yuta,et al. Coordinating Autonomous And Centralized Decision Making To Achieve Cooperative Behaviors Between Multiple Mobile Robots , 1992, Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems.
[9] Yoav Shoham,et al. Multi-Agent Reinforcement Learning: A Critical Survey , 2003 .
[10] Martin Lauer,et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems , 2000, ICML.
[11] José M. F. Moura,et al. Cooperative Convex Optimization in Networked Systems: Augmented Lagrangian Algorithms With Directed Gossip Communication , 2010, IEEE Transactions on Signal Processing.
[12] Michael P. Wellman,et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.
[13] Manuela M. Veloso,et al. Rational and Convergent Learning in Stochastic Games , 2001, IJCAI.
[14] John Baillieul,et al. Robust and efficient quantization and coding for control of multidimensional linear systems under data rate constraints , 2006, CDC 2006.
[15] Michael L. Littman,et al. Value-function reinforcement learning in Markov games , 2001, Cognitive Systems Research.
[16] John N. Tsitsiklis,et al. Asynchronous stochastic approximation and Q-learning , 1993, Proceedings of 32nd IEEE Conference on Decision and Control.
[17] Jie Lin,et al. Coordination of groups of mobile autonomous agents using nearest neighbor rules , 2003, IEEE Trans. Autom. Control.
[18] Csaba Szepesvári,et al. The Asymptotic Convergence-Rate of Q-learning , 1997, NIPS.
[19] Stephen P. Boyd,et al. Randomized gossip algorithms , 2006, IEEE Transactions on Information Theory.
[20] Nikos A. Vlassis,et al. Non-communicative multi-robot coordination in dynamic environments , 2005, Robotics Auton. Syst.
[21] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[22] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[23] Soummya Kar,et al. Distributed Consensus Algorithms in Sensor Networks: Quantized Data and Random Link Failures , 2007, IEEE Transactions on Signal Processing.
[24] Peter Stone,et al. CMUnited: a team of robotics soccer agents collaborating in an adversarial environment , 1998, CROS.
[25] John N. Tsitsiklis,et al. On distributed averaging algorithms and quantization effects , 2007, 2008 47th IEEE Conference on Decision and Control.
[26] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell.
[27] Richard M. Murray,et al. Consensus problems in networks of agents with switching topology and time-delays , 2004, IEEE Transactions on Automatic Control.
[28] José M. F. Moura,et al. Large deviations analysis of consensus+innovations detection in random networks , 2011, 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[29] Peter Secretan. Learning , 1965, Mental Health.
[30] Hiroaki Kitano,et al. RoboCup-97: The First Robot World Cup Soccer Games and Conferences , 1998, AI Mag.
[31] Angelia Nedic,et al. Incremental Stochastic Subgradient Algorithms for Convex Optimization , 2008, SIAM J. Optim.
[32] Soummya Kar,et al. Distributed Consensus Algorithms in Sensor Networks With Imperfect Communication: Link Failures and Channel Noise , 2007, IEEE Transactions on Signal Processing.
[33] Michail G. Lagoudakis,et al. Coordinated Reinforcement Learning , 2002, ICML.
[34] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[35] H. Vincent Poor,et al. Distributed Linear Parameter Estimation: Asymptotically Efficient Adaptive Strategies , 2011, SIAM J. Control. Optim.
[36] John N. Tsitsiklis,et al. Distributed Asynchronous Deterministic and Stochastic Gradient Optimization Algorithms , 1984, 1984 American Control Conference.
[37] Sekhar Tatikonda,et al. Control under communication constraints , 2004, IEEE Transactions on Automatic Control.
[38] Soummya Kar,et al. Gossip Algorithms for Distributed Signal Processing , 2010, Proceedings of the IEEE.
[39] Ian A. Hiskens,et al. Achieving Controllability of Electric Loads , 2011, Proceedings of the IEEE.
[40] Michael William Newman,et al. The Laplacian spectrum of graphs , 2001 .
[41] Andrey V. Savkin,et al. The problem of state estimation via asynchronous communication channels with irregular transmission times , 2001, Proceedings of the 40th IEEE Conference on Decision and Control.
[42] Soummya Kar,et al. Convergence Rate Analysis of Distributed Gossip (Linear Parameter) Estimation: Fundamental Limits and Tradeoffs , 2010, IEEE Journal of Selected Topics in Signal Processing.
[43] B. Mohar. The Laplacian Spectrum of Graphs , 1991 .
[44] C.C. White,et al. Dynamic programming and stochastic control , 1978, Proceedings of the IEEE.
[45] Nikos A. Vlassis,et al. Using the Max-Plus Algorithm for Multiagent Decision Making in Coordination Graphs , 2005, BNAIC.
[46] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[47] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[48] Manuela M. Veloso,et al. Decentralized MDPs with sparse interactions , 2011, Artif. Intell.
[49] Gonzalo Mateos,et al. Distributed Sparse Linear Regression , 2010, IEEE Transactions on Signal Processing.
[50] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[51] Fan Chung,et al. Spectral Graph Theory , 1996 .
[52] Csaba Szepesvári,et al. A Generalized Reinforcement-Learning Model: Convergence and Applications , 1996, ICML.
[53] Asuman E. Ozdaglar,et al. Distributed Subgradient Methods for Multi-Agent Optimization , 2009, IEEE Transactions on Automatic Control.
[54] Soummya Kar,et al. Distributed Parameter Estimation in Sensor Networks: Nonlinear Observation Models and Imperfect Communication , 2008, IEEE Transactions on Information Theory.
[55] Daniel Kudenko,et al. Reinforcement learning of coordination in cooperative multi-agent systems , 2002, AAAI/IAAI.