Katja Hofmann | Christos Dimitrakakis | Aristide C. Y. Tossou | Jaroslaw Rzepecki
[1] Ann Nowé, et al. Designing multi-objective multi-armed bandits algorithms: A study, 2013, The 2013 International Joint Conference on Neural Networks (IJCNN).
[2] Peter Stone, et al. A polynomial-time Nash equilibrium algorithm for repeated games, 2003, EC '03.
[3] J. Nash. The Bargaining Problem, 1950, Classics in Game Theory.
[4] Michael A. Goodrich, et al. Learning To Cooperate in a Social Dilemma: A Satisficing Approach to Bargaining, 2003, ICML.
[5] Yoav Shoham, et al. Learning against opponents with bounded memory, 2005, IJCAI.
[6] Peter Stone, et al. Multiagent learning in the presence of memory-bounded agents, 2013, Autonomous Agents and Multi-Agent Systems.
[7] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res.
[8] J. Nash. Non-Cooperative Games, 1951, Classics in Game Theory.
[9] Sarah Filippi, et al. Optimism in reinforcement learning and Kullback-Leibler divergence, 2010, 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[10] Yoav Shoham, et al. A general criterion and an algorithmic framework for learning in multi-agent systems, 2007, Machine Learning.
[11] Vincent Conitzer, et al. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents, 2003, Machine Learning.
[12] Ilan Adler. The equivalence of linear programs and zero-sum games, 2013, Int. J. Game Theory.
[13] Michael L. Littman, et al. A Polynomial-time Nash Equilibrium Algorithm for Repeated Stochastic Games, 2008, UAI.
[14] H. Imai. Individual Monotonicity and Lexicographic Maxmin Solution, 1983.
[15] Dan W. Brock, et al. The Theory of Justice, 2017.
[16] E. Kalai. Proportional Solutions to Bargaining Situations: Interpersonal Utility Comparisons, 1977.
[17] Ariel Rubinstein, et al. A Course in Game Theory, 1995.
[18] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[19] Walter Bossert, et al. An arbitration game and the egalitarian bargaining solution, 1995.
[20] Keith B. Hall, et al. Correlated Q-Learning, 2003, ICML.
[21] Michael A. Goodrich, et al. Learning to compete, coordinate, and cooperate in repeated games using reinforcement learning, 2011, Machine Learning.
[22] Bikramjit Banerjee, et al. Performance Bounded Reinforcement Learning in Strategic Interactions, 2004, AAAI.
[23] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[24] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[25] Chen-Yu Wei, et al. Online Reinforcement Learning in Stochastic Games, 2017, NIPS.
[26] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.