A Bayesian approach to multiagent reinforcement learning and coalition formation under uncertainty
[1] L. Shapley,et al. Fictitious Play Property for Games with Identical Interests , 1996 .
[2] Michail G. Lagoudakis,et al. Coordinated Reinforcement Learning , 2002, ICML.
[3] Yishay Mansour,et al. Nash Convergence of Gradient Dynamics in General-Sum Games , 2000, UAI.
[4] Anja De Waegenaere,et al. Cooperative games with stochastic payoffs , 1999, Eur. J. Oper. Res..
[5] P. Poupart. Exploiting structure to efficiently solve large scale partially observable Markov decision processes , 2005 .
[6] Peter Marbach,et al. Cooperation in wireless ad hoc networks: a market-based approach , 2005, IEEE/ACM Transactions on Networking.
[7] Michael P. Wellman,et al. Multiagent Reinforcement Learning in Stochastic Games , 1999, ICML 1999.
[8] Sebastian Thrun,et al. The role of exploration in learning control , 1992 .
[9] B. Moldovanu,et al. Order independent equilibria , 1995 .
[10] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .
[11] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[12] Craig Boutilier,et al. Coordination in multiagent reinforcement learning: a Bayesian approach , 2003, AAMAS '03.
[13] David Andre,et al. Model based Bayesian Exploration , 1999, UAI.
[14] Matthias Klusch,et al. Dynamic Coalition Formation among Rational Agents , 2002, IEEE Intell. Syst..
[15] Andrew W. Moore,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time , 1993, Machine Learning.
[16] G. Stengle. A nullstellensatz and a positivstellensatz in semialgebraic geometry , 1974 .
[17] John J. Grefenstette,et al. Credit assignment in rule discovery systems based on genetic algorithms , 1988, Machine Learning.
[18] Ulrich Schwalbe,et al. Dynamic Coalition Formation and the Core , 2002 .
[19] J. Filar,et al. Competitive Markov Decision Processes , 1996 .
[20] Regret in the On-line Decision , 1997 .
[21] E. Kalai,et al. Rational Learning Leads to Nash Equilibrium , 1993 .
[22] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[23] Roberto Serrano,et al. Non-cooperative implementation of the core , 1997 .
[24] Nicholas R. Jennings,et al. Coalition Structure Generation in Task-Based Settings , 2006, ECAI.
[25] Craig Boutilier,et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage , 1999, J. Artif. Intell. Res..
[26] Craig Boutilier,et al. Symbolic Dynamic Programming for First-Order MDPs , 2001, IJCAI.
[27] Ronald J. Williams,et al. Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions , 1993 .
[28] Matthias Klusch,et al. Fuzzy kernel-stable coalitions between rational agents , 2003, AAMAS '03.
[29] E. M. Wright,et al. Adaptive Control Processes: A Guided Tour , 1961, The Mathematical Gazette.
[30] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[31] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[32] Ariel Rubinstein,et al. A Course in Game Theory , 1995 .
[33] P. Borm,et al. Stochastic Cooperative Games: Superadditivity, Convexity, and Certainty Equivalents , 1999 .
[34] Leslie Pack Kaelbling,et al. On the Complexity of Solving Markov Decision Problems , 1995, UAI.
[35] Sarit Kraus,et al. The advantages of compromising in coalition formation with incomplete information , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[36] Manuela M. Veloso,et al. Convergence of Gradient Dynamics with a Variable Learning Rate , 2001, ICML.
[37] Tao Wang,et al. Bayesian sparse sampling for on-line reward optimization , 2005, ICML.
[38] Maja J. Mataric,et al. Reward Functions for Accelerated Learning , 1994, ICML.
[39] Tucker Balch,et al. Learning Roles: Behavioral Diversity in Robot Teams , 1997 .
[40] Anatol Rapoport,et al. Theories of Coalition Formation , 1998 .
[41] Arie Tamir,et al. On the core of network synthesis games , 1991, Math. Program..
[42] Nicholas R. Jennings,et al. Overlapping Coalition Formation for Efficient Data Fusion in Multi-Sensor Networks , 2006, AAAI.
[43] Judy Goldsmith,et al. Nonapproximability Results for Partially Observable Markov Decision Processes , 2011, Universität Trier, Mathematik/Informatik, Forschungsbericht.
[44] J K Goeree,et al. Stochastic game theory: for playing games, not just for doing theory. , 1999, Proceedings of the National Academy of Sciences of the United States of America.
[45] J. J. Martin. Bayesian Decision Problems and Markov Chains , 1967 .
[46] Manuela M. Veloso,et al. Rational and Convergent Learning in Stochastic Games , 2001, IJCAI.
[47] J. Nash. Two-Person Cooperative Games , 1953 .
[48] Howard Raiffa,et al. Games And Decisions , 1958 .
[49] Karl Johan Åström,et al. Optimal control of Markov processes with incomplete state information , 1965 .
[50] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[51] Katia P. Sycara,et al. Algorithm for combinatorial coalition formation and payoff division in an electronic marketplace , 2002, AAMAS '02.
[52] Christos H. Papadimitriou,et al. Worst-case equilibria , 1999 .
[53] George E. Monahan,et al. A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 2007 .
[54] Stuart J. Russell,et al. Do the right thing , 1991 .
[55] W. Hamilton,et al. The evolution of cooperation. , 1984, Science.
[56] Somesh Jha,et al. Multi-Agent Coordination through Coalition Formation , 1997, ATAL.
[57] Keith B. Hall,et al. Correlated Q-Learning , 2003, ICML.
[58] Steven Willmott,et al. Modelling coalition formation over time for iterative coalition games , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[59] Stuart J. Russell,et al. Bayesian Q-Learning , 1998, AAAI/IAAI.
[60] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[61] Herbert Gintis,et al. Game Theory Evolving: A Problem-Centered Introduction to Modeling Strategic Interaction - Second Edition , 2009 .
[62] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[63] Eric Joel Horvitz. Computation and action under bounded resources , 1991 .
[64] S. Basu,et al. Algorithms in real algebraic geometry , 2003 .
[65] Lloyd S. Shapley,et al. On balanced sets and cores , 1967 .
[66] Morton D. Davis,et al. The kernel of a cooperative game , 1965 .
[67] Janusz S. Kowalik,et al. Iterative methods for nonlinear optimization problems , 1972 .
[68] S. Hart,et al. Bargaining and Value , 1996 .
[69] P. W. Jones,et al. Bandit Problems, Sequential Allocation of Experiments , 1987 .
[70] Daniel Kudenko,et al. Reinforcement learning of coordination in cooperative multi-agent systems , 2002, AAAI/IAAI.
[71] Akira Okada. A Noncooperative Coalitional Bargaining Game with Random Proposers , 1996 .
[72] I. Grossmann. Review of Nonlinear Mixed-Integer and Disjunctive Programming Techniques , 2002 .
[73] R. Stearns. Convergent transfer schemes for $N$-person games , 1968 .
[74] H P Young,et al. On the impossibility of predicting the behavior of rational agents , 2001, Proceedings of the National Academy of Sciences of the United States of America.
[75] Maja J. Matarić,et al. Learning to behave socially , 1994 .
[76] Katia P. Sycara,et al. Distributed Intelligent Agents , 1996, IEEE Expert.
[77] Andrew G. Barto,et al. Optimal learning: computational procedures for bayes-adaptive markov decision processes , 2002 .
[78] Victor R. Lesser,et al. Coalitions Among Computationally Bounded Agents , 1997, Artif. Intell..
[79] Yishay Mansour,et al. Fast Planning in Stochastic Games , 2000, UAI.
[80] R. Bellman,et al. Dynamic Programming and Markov Processes , 1960 .
[81] J. K. Satia,et al. Markovian Decision Processes with Uncertain Transition Probabilities , 1973, Oper. Res..
[82] John Nachbar. Prediction, optimization, and learning in repeated games , 1997 .
[83] Roger B. Myerson,et al. Game theory - Analysis of Conflict , 1991 .
[84] Marco Wiering,et al. Explorations in efficient reinforcement learning , 1999 .
[85] Huibin Yan,et al. Noncooperative selection of the core , 2003, Int. J. Game Theory.
[86] Victor R. Lesser,et al. Organization-based coalition formation , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[87] Katia P. Sycara,et al. A stable and efficient buyer coalition formation scheme for e-marketplaces , 2001, AGENTS '01.
[88] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[89] Pascal Poupart,et al. Bayesian Reputation Modeling in E-Marketplaces Sensitive to Subjectivity, Deception and Change , 2006, AAAI.
[90] Sarit Kraus,et al. Multiagent Negotiation under Time Constraints , 1995, Artif. Intell..
[91] Jeremy L. Wyatt,et al. Exploration Control in Reinforcement Learning using Optimistic Model Selection , 2001, International Conference on Machine Learning.
[92] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.
[93] Debraj Ray,et al. A noncooperative theory of coalitional bargaining , 1993 .
[94] Sachiyo Arai,et al. Credit assignment method for learning effective stochastic policies in uncertain domains , 2001 .
[95] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..
[96] Craig Boutilier,et al. A Bayesian Approach to Imitation in Reinforcement Learning , 2003, IJCAI.
[97] Jeffrey S. Rosenschein,et al. Coalition, Cryptography, and Stability: Mechanisms for Coalition Formation in Task Oriented Domains , 2018, AAAI.
[98] Amy Greenwald,et al. A General Class of No-Regret Learning Algorithms and Game-Theoretic Equilibria , 2003, COLT.
[99] Katia P. Sycara,et al. Coordination of Multiple Intelligent Software Agents , 1996, Int. J. Cooperative Inf. Syst..
[100] Onn Shehory,et al. Coalition structure generation with worst case guarantees , 2022 .
[101] Michael Wooldridge,et al. Understanding the Emergence of Conventions in Multi-Agent Systems , 1995, ICMAS.
[102] David Carmel,et al. Learning Models of Intelligent Agents , 1996, AAAI/IAAI, Vol. 1.
[103] M. Degroot. Optimal Statistical Decisions , 1970 .
[104] Katia P. Sycara,et al. Mechanisms for coalition formation and cost sharing in an electronic marketplace , 2003, ICEC '03.
[105] J. Friedman. Game theory with applications to economics , 1986 .
[106] Eric Allender,et al. Complexity of finite-horizon Markov decision process problems , 2000, JACM.
[107] Anatol Rapoport,et al. N-Person Game Theory , 1970 .
[108] Vincent Conitzer,et al. Complexity of determining nonemptiness of the core , 2003, EC '03.
[109] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[110] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[111] Nicholas R. Jennings,et al. TRAVOS: Trust and Reputation in the Context of Inaccurate Information Sources , 2006, Autonomous Agents and Multi-Agent Systems.
[112] Craig Boutilier,et al. Planning, Learning and Coordination in Multiagent Decision Processes , 1996, TARK.
[113] Bikramjit Banerjee,et al. Selecting partners , 2000, AGENTS '00.
[114] Sachiyo Arai,et al. Multi-agent reinforcement learning for planning and conflict resolution in a dynamic domain , 2000, AGENTS '00.
[115] Yoav Shoham,et al. On the Agenda(s) of Research on Multi-Agent Learning , 2004, AAAI Technical Report.
[116] Murali Agastya,et al. Adaptive Play in Multiplayer Bargaining Situations , 1997 .
[117] Craig Boutilier,et al. Coalition formation under uncertainty: bargaining equilibria and the Bayesian core stability concept , 2007, AAMAS '07.
[118] Xin Li,et al. Adaptive, confidence-based multiagent negotiation strategy , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[119] Marie-Françoise Roy,et al. On the combinatorial and algebraic complexity of quantifier elimination , 1994 .
[120] John Nachbar,et al. Bayesian learning in repeated games of incomplete information , 2001, Soc. Choice Welf..
[121] Sandip Sen,et al. Learning to Coordinate without Sharing Information , 1994, AAAI.
[122] John C. Harsanyi,et al. A General Theory of Equilibrium Selection in Games , 1989 .
[123] Stuart J. Russell,et al. Do the right thing - studies in limited rationality , 1991 .
[124] Moshe Tennenholtz,et al. On the Synthesis of Useful Social Laws for Artificial Agent Societies (Preliminary Report) , 1992, AAAI.
[125] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .
[126] Sarit Kraus,et al. Methods for Task Allocation via Agent Coalition Formation , 1998, Artif. Intell..
[127] Hiroaki Kitano,et al. RoboCup Rescue A Grand Challenge for Multiagent and Intelligent Systems , 2001 .
[128] M. Matarić. Learning to Behave Socially , 1994 .
[129] Craig Boutilier,et al. Sequential Optimality and Coordination in Multiagent Systems , 1999, IJCAI.
[130] Michael P. Wellman,et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.
[131] Sarit Kraus,et al. Coalition formation with uncertain heterogeneous information , 2003, AAMAS '03.
[132] B. Sturmfels. Solving Systems of Polynomial Equations , 2002 .
[133] John G. Kemeny,et al. Finite Markov chains , 1960 .
[134] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..
[135] Steven Reece,et al. Rumours and reputation: evaluating multi-dimensional trust within a decentralised reputation system , 2007, AAMAS '07.
[136] Y. Shoham,et al. Editorial: economic principles of multi-agent systems , 1997 .
[137] Debraj Ray,et al. Coalition formation as a dynamic process , 2003, J. Econ. Theory.
[138] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[139] Craig Boutilier,et al. Implicit Imitation in Multiagent Reinforcement Learning , 1999, ICML.
[140] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[141] Paul Bourgine,et al. Exploration of Multi-State Environments: Local Measures and Back-Propagation of Uncertainty , 1999, Machine Learning.
[142] Xiaotie Deng,et al. On the Complexity of Cooperative Solution Concepts , 1994, Math. Oper. Res..
[143] Martin Lauer,et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems , 2000, ICML.
[144] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[145] R. Evans,et al. Coalitional Bargaining with Competition to Make Offers , 1997 .
[146] Katia P. Sycara,et al. Multi-agent learning in extensive games with complete information , 2003, AAMAS '03.
[147] C. Lee Giles,et al. Talking Helps: Evolving Communicating Agents for the Predator-Prey Pursuit Problem , 2000, Artificial Life.
[148] Craig Boutilier,et al. Coalitional Bargaining with Agent Type Uncertainty , 2007, IJCAI.
[149] Craig Boutilier,et al. Bayesian reinforcement learning for coalition formation under uncertainty , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[150] Scott Sanner,et al. Practical Linear Value-approximation Techniques for First-order MDPs , 2006, UAI.
[151] Matthias Klusch,et al. Trusted kernel-based coalition formation , 2005, AAMAS '05.
[152] Tommi S. Jaakkola,et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms , 2000, Machine Learning.
[153] A. Rubinstein. Perfect Equilibrium in a Bargaining Model , 1982 .
[154] Alun D. Preece,et al. Agent-based virtual organisations for the Grid , 2005, AAMAS '05.
[155] Alun D. Preece,et al. Agent-based formation of virtual organisations , 2004, Knowl. Based Syst..
[156] Manuela Veloso,et al. An Analysis of Stochastic Game Theory for Multiagent Reinforcement Learning , 2000 .
[157] M. Wooders,et al. Multijurisdictional economies, the tiebout hypothesis, and sorting. , 1999, Proceedings of the National Academy of Sciences of the United States of America.
[158] Nicholas R. Jennings,et al. Computational-Mechanism Design: A Call to Arms , 2003, IEEE Intell. Syst..
[159] J. Neumann,et al. Theory of games and economic behavior , 1945, 100 Years of Math Milestones.
[160] Sarit Kraus,et al. Feasible Formation of Coalitions Among Autonomous Agents in Nonsuperadditive Environments , 1999, Comput. Intell..
[161] Vincent Conitzer,et al. Coalitional Games in Open Anonymous Environments , 2005, IJCAI.
[162] L. S. Shapley,et al. A Value for n-Person Games , 1953 .
[163] Craig Boutilier,et al. Learning Conventions in Multiagent Stochastic Domains using Likelihood Estimates , 1996, UAI.
[164] Holly A. Yanco,et al. An adaptive communication protocol for cooperating mobile robots , 1993 .
[165] Xiaofeng Wang,et al. Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games , 2002, NIPS.
[166] Dov Samet,et al. Learning to play games in extensive form by valuation , 2001, J. Econ. Theory.
[167] Dimitri P. Bertsekas,et al. Dynamic Programming: Deterministic and Stochastic Models , 1987 .
[168] P. Reny,et al. A Noncooperative View of Coalition Formation and the Core , 1994 .