Paper information

A near-optimal polynomial time algorithm for learning in certain classes of stochastic games

Ronen I. Brafman | Moshe Tennenholtz

[1] Michael P. Wellman, et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, 1998, ICML.

[2] Moshe Tennenholtz, et al. Dynamic Non-Bayesian Decision Making, 1997, J. Artif. Intell. Res.

[3] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.

[4] O. J. Vrieze. Linear programming and undiscounted stochastic games in which one player controls transitions, 1981.

[5] T. Parthasarathy, et al. An orderfield property for stochastic games when one player controls transition probabilities, 1981.

[6] L. Shapley, et al. Stochastic Games, 1953, Proceedings of the National Academy of Sciences.

[7] H. Chernoff. A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the Sum of Observations, 1952.

[8] Noga Alon, et al. The Probabilistic Method, 2015, Fundamentals of Ramsey Theory.