Multiagent Reinforcement Learning: Rollout and Policy Iteration
[1] Stephanie Gil,et al. Multiagent Rollout and Policy Iteration for POMDP with Application to Multi-Robot Repair Problems , 2020, CoRL.
[2] Ioannis Ch. Paschalidis,et al. Learning parametric policies and transition probability models of markov decision processes from data , 2020, Eur. J. Control.
[3] Giovanni Russo,et al. On a probabilistic approach to synthesize control policies from example datasets , 2020, Autom..
[4] Dimitri Bertsekas,et al. Multiagent Value Iteration Algorithms in Dynamic Programming and Reinforcement Learning , 2020, Results in Control and Optimization.
[5] Abhishek Gupta,et al. Existence of Team-Optimal Solutions in Static Teams with Common Information: A Topology of Information Approach , 2020, SIAM J. Control. Optim..
[6] D. Bertsekas. Constrained Multiagent Rollout and Multidimensional Assignment with the Auction Algorithm , 2020, ArXiv.
[7] Stephanie Gil,et al. Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration With Application to Autonomous Sequential Repair Problems , 2020, IEEE Robotics and Automation Letters.
[8] Na Li,et al. Distributed Reinforcement Learning for Decentralized Linear Quadratic Control: A Derivative-Free Policy Optimization Approach , 2019, IEEE Transactions on Automatic Control.
[9] Thomas Parisini,et al. Neural Approximations for Optimal Control and Decision , 2019 .
[10] T. Başar,et al. Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms , 2019, Handbook of Reinforcement Learning and Control.
[11] D. Bertsekas. Multiagent Rollout Algorithms and Reinforcement Learning , 2019, ArXiv.
[12] Guannan Qu,et al. Exploiting Fast Decaying and Locality in Multi-Agent MDP with Tree Dependence Structure , 2019, 2019 IEEE 58th Conference on Decision and Control (CDC).
[13] Matthew E. Taylor,et al. A survey and critique of multiagent deep reinforcement learning , 2018, Autonomous Agents and Multi-Agent Systems.
[14] Afshin Oroojlooyjadid,et al. A review of cooperative multi-agent deep reinforcement learning , 2019, Applied Intelligence.
[15] Ashutosh Nayyar,et al. Common Knowledge and Sequential Team Problems , 2019, IEEE Transactions on Automatic Control.
[16] Yan Zhang,et al. Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus , 2019, 2019 IEEE 58th Conference on Decision and Control (CDC).
[17] Saeid Nahavandi,et al. Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications , 2018, IEEE Transactions on Cybernetics.
[18] Shimon Whiteson,et al. Multi-Agent Common Knowledge Reinforcement Learning , 2018, NeurIPS.
[19] Ying Wen,et al. Factorized Q-learning for large-scale multi-agent systems , 2018, DAI.
[20] Dimitri P. Bertsekas,et al. Feature-based aggregation and deep reinforcement learning: a survey and some new implementations , 2018, IEEE/CAA Journal of Automatica Sinica.
[21] Tamer Basar,et al. Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents , 2018, ICML.
[22] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[23] Jangwon Lee,et al. A survey of robot learning from demonstrations for Human-Robot Collaboration , 2017, ArXiv.
[24] Pablo Hernandez-Leal,et al. A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity , 2017, ArXiv.
[25] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[26] Hao Liu,et al. Learning Policies for Markov Decision Processes From Data , 2017, IEEE Transactions on Automatic Control.
[27] Frans A. Oliehoek,et al. A Concise Introduction to Decentralized POMDPs , 2016, SpringerBriefs in Intelligent Systems.
[28] Jonathan P. How,et al. Graph-based Cross Entropy method for solving multi-robot decentralized POMDPs , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[29] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[30] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[31] Jan Peters,et al. Learning responsive robot behavior by imitation , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[32] Dimitri P. Bertsekas,et al. Q-learning and policy iteration algorithms for stochastic shortest path problems , 2012, Annals of Operations Research.
[33] Dimitri P. Bertsekas,et al. Abstract Dynamic Programming , 2013 .
[34] Dario Bauso,et al. Team Theory and Person-by-Person Optimization with Binary Decisions , 2012, SIAM J. Control. Optim..
[35] Ashutosh Nayyar,et al. Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach , 2012, IEEE Transactions on Automatic Control.
[36] Guillaume J. Laurent,et al. Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems , 2012, The Knowledge Engineering Review.
[37] D. Bertsekas. Approximate policy iteration: a survey and some new methods , 2011 .
[38] Ioannis Ch. Paschalidis,et al. Mobile agent coordination via a distributed actor-critic algorithm , 2011, 2011 19th Mediterranean Conference on Control & Automation (MED).
[39] Dimitri P. Bertsekas,et al. Q-learning and enhanced policy iteration in discounted dynamic programming , 2010, 49th IEEE Conference on Decision and Control (CDC).
[40] Magnus Egerstedt,et al. Graph Theoretic Methods in Multiagent Networks , 2010, Princeton Series in Applied Mathematics.
[41] Alessandro Lazaric,et al. Analysis of a Classification-based Policy Iteration Algorithm , 2010, ICML.
[42] Ioannis Ch. Paschalidis,et al. A Distributed Actor-Critic Algorithm and Applications to Mobile Sensor Network Coordination Problems , 2010, IEEE Transactions on Automatic Control.
[43] Jorge Cortes,et al. Distributed Control of Robotic Networks: A Mathematical Approach to Motion Coordination Algorithms , 2009 .
[44] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst..
[45] R. Pesenti,et al. Generalized person-by-person optimization in team problems with binary decisions , 2008, 2008 American Control Conference.
[46] Christos Dimitrakakis,et al. Rollout sampling approximate policy iteration , 2008, Machine Learning.
[47] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[48] Csaba Szepesvári,et al. Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods , 2007, UAI.
[49] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[50] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[51] Gerald Tesauro,et al. Extending Q-Learning to General Adaptive Multi-Agent Systems , 2003, NIPS.
[52] Michail G. Lagoudakis,et al. Reinforcement Learning as Classification: Leveraging Modern Classifiers , 2003, ICML.
[53] Manuela M. Veloso,et al. Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.
[54] Gerald Tesauro,et al. On-line Policy Improvement using Monte-Carlo Search , 1996, NIPS.
[55] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[56] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[57] John N. Tsitsiklis,et al. Asynchronous stochastic approximation and Q-learning , 1993, Proceedings of 32nd IEEE Conference on Decision and Control.
[58] Michael Athans,et al. Survey of decentralized control methods for large scale systems , 1978 .
[59] Tsuneo Yoshikawa,et al. Decomposition of Dynamic Team Decision Problems , 1977 .
[60] H. Witsenhausen. Separation of estimation and control for discrete time systems , 1971 .
[61] H. Witsenhausen. On Information Structures, Feedback and Causality , 1971 .
[62] H. Witsenhausen. A Counterexample in Stochastic Optimum Control , 1968 .
[63] R. Radner,et al. Team Decision Problems , 1962 .
[64] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.
[65] D. Bertsekas. Rollout, Approximate Policy Iteration, and Distributed Reinforcement Learning , 2020 .
[66] D. Bertsekas. Reinforcement Learning and Optimal Control: A Selective Overview , 2018 .
[67] Bart De Schutter,et al. Multi-agent Reinforcement Learning: An Overview , 2010 .
[68] Dimitri P. Bertsekas,et al. Neuro-Dynamic Programming , 2009, Encyclopedia of Optimization.
[69] Nikos A. Vlassis,et al. The Cross-Entropy Method for Policy Search in Decentralized POMDPs , 2008, Informatica.
[70] Agostino Poggi,et al. Multiagent Systems , 2006, Intelligenza Artificiale.
[71] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[72] John N. Tsitsiklis,et al. Parallel and distributed computation , 1989 .
[74] Yu-Chi Ho. Team decision theory and information structures , 1980, Proceedings of the IEEE.
[75] R. Radner,et al. Economic theory of teams , 1972 .
[76] J. Marschak,et al. Elements for a Theory of Teams , 1955 .
[77] J. Walrand,et al. Distributed Dynamic Programming , 2022 .