Promoting training of multi-agent systems

This paper considers incentive training (reinforcement learning) of multi-agent systems, formulated as a game, for collective decision making under uncertainty. Incentive training methods do not require a mathematical model of the environment and allow decisions to be made during the training process itself. A Markov model of the stochastic game is constructed, and the criteria for its solution are formulated. An iterative Q-method for solving the stochastic game is described; it is based on numerical identification of the characteristic function of a dynamic system in state-action space. Players' current payoffs are obtained by randomizing the elements of the payoff Q-matrix. Mixed player strategies are calculated using the Boltzmann method, and pure strategies are then drawn from the discrete random distributions that these mixed strategies define. An algorithm for solving the stochastic game is developed, and the results of a computer implementation of the game Q-method are analyzed.
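The steps named in the abstract (Q-value estimation in state-action space, Boltzmann mixed strategies, pure strategies sampled from them) can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no update formula, so the standard Q-learning rule is substituted, and the two-state payoff table, Gaussian payoff noise, uniform state transitions, and all parameter values (`alpha`, `gamma`, `temperature`) are assumptions chosen only for illustration.

```python
import math
import random


def boltzmann(q_values, temperature):
    """Mixed strategy: Boltzmann (softmax) distribution over Q-values."""
    m = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def sample_pure(mixed, rng):
    """Draw a pure strategy (action index) from a mixed strategy."""
    return rng.choices(range(len(mixed)), weights=mixed, k=1)[0]


def run_game(n_steps=5000, alpha=0.1, gamma=0.9, temperature=0.5, seed=0):
    """Two players learn independently in a two-state stochastic game."""
    rng = random.Random(seed)
    n_players, n_states, n_actions = 2, 2, 2
    # Hypothetical mean payoffs: payoff[state][a0][a1] -> (gain0, gain1).
    payoff = [
        [[(1.0, 1.0), (0.0, 0.0)], [(0.0, 0.0), (0.5, 0.5)]],
        [[(0.5, 0.5), (0.0, 0.0)], [(0.0, 0.0), (1.0, 1.0)]],
    ]
    # q[player][state][action]: each player tracks only its own actions.
    q = [[[0.0] * n_actions for _ in range(n_states)]
         for _ in range(n_players)]
    state = 0
    for _ in range(n_steps):
        # Mixed strategies via Boltzmann, then sampled pure strategies.
        actions = [
            sample_pure(boltzmann(q[i][state], temperature), rng)
            for i in range(n_players)
        ]
        # Assumption: current gains are noisy samples around mean payoffs.
        means = payoff[state][actions[0]][actions[1]]
        gains = [mean + rng.gauss(0.0, 0.1) for mean in means]
        # Assumption: the next state is drawn uniformly at random.
        next_state = rng.randrange(n_states)
        # Standard Q-learning update (substituted for the paper's rule).
        for i in range(n_players):
            target = gains[i] + gamma * max(q[i][next_state])
            q[i][state][actions[i]] += alpha * (target - q[i][state][actions[i]])
        state = next_state
    return q
```

Calling `run_game()` returns the learned Q-tables, indexed as `q[player][state][action]`. Lowering `temperature` makes the Boltzmann strategies more deterministic, while raising it keeps the players exploring.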
