Learning in Congestion Games with Bandit Feedback
S. Du | Zhihan Xiong | Qiwen Cui | Maryam Fazel
[1] Yuejie Chi,et al. Independent Natural Policy Gradient Methods for Potential Games: Finite-time Global Convergence with Entropy Regularization , 2022, 2022 IEEE 61st Conference on Decision and Control (CDC).
[2] K. Zhang,et al. Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence , 2022, ICML.
[3] L. Ratliff,et al. Improved Rates for Derivative Free Gradient Play in Strongly Monotone Games , 2021, 2022 IEEE 61st Conference on Decision and Control (CDC).
[4] Roy Fox,et al. Independent Natural Policy Gradient Always Converges in Markov Potential Games , 2021, AISTATS.
[5] Song Mei,et al. When Can We Learn General-Sum Markov Games with a Large Number of Players Sample-Efficiently? , 2021, ICLR.
[6] Chi Jin,et al. V-Learning - A Simple, Efficient, Decentralized Algorithm for Multiagent RL , 2021, ArXiv.
[7] Na Li,et al. Gradient play in stochastic games: stationary points, convergence, and sample complexity , 2021, 2106.00198.
[8] Qinghua Liu,et al. A Sharp Analysis of Model-based Reinforcement Learning with Self-Play , 2020, ICML.
[9] Oracle-Efficient Regret Minimization in Factored MDPs with Unknown Structure , 2020, 2009.05986.
[10] Lihong Li,et al. Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL , 2020, ICLR.
[11] Suvrit Sra,et al. Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes , 2020, NeurIPS.
[12] Yun Kuen Cheung,et al. Chaos, Extremism and Optimism: Volume Analysis of Learning in Games , 2020, NeurIPS.
[13] Chi Jin,et al. Provable Self-Play Algorithms for Competitive Reinforcement Learning , 2020, ICML.
[14] Ambuj Tewari,et al. Reinforcement Learning in Factored MDPs: Oracle-Efficient Algorithms and Tighter Regret Bounds for the Non-Episodic Setting , 2020, NeurIPS.
[15] Amin Karbasi,et al. One Sample Stochastic Frank-Wolfe , 2019, AISTATS.
[16] Sebastian Bervoets,et al. Learning with minimal information in continuous games , 2018, Theoretical Economics.
[17] Francesco Orabona. A Modern Introduction to Online Learning , 2019, ArXiv.
[18] T. Başar,et al. Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms , 2019, Handbook of Reinforcement Learning and Control.
[19] Anoop Cherian,et al. Game Theoretic Optimization via Gradient-based Nikaido-Isoda Function , 2019, ICML.
[20] David S. Leslie,et al. Bandit learning in concave $N$-person games , 2018, 1810.01925.
[21] Michael I. Jordan,et al. Is Q-learning Provably Efficient? , 2018, NeurIPS.
[22] Santiago Zazo,et al. Learning Parametric Closed-Loop Policies for Markov Potential Games , 2018, ICLR.
[23] Soummya Kar,et al. On Best-Response Dynamics in Potential Games , 2017, SIAM J. Control. Optim..
[24] Stéphane Durand,et al. Analysis of Best Response Dynamics in Potential Games. (Analyse de la meilleure dynamique de réponse dans les jeux potentiels) , 2018 .
[25] Johanne Cohen,et al. Learning with Bandit Feedback in Potential Games , 2017, NIPS.
[26] Andrew H. Kemp,et al. Congestion Control for 6LoWPAN Networks: A Game Theoretic Framework , 2017, IEEE Internet of Things Journal.
[27] Aviad Rubinstein,et al. Settling the Complexity of Computing Approximate Two-Player Nash Equilibria , 2016, 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS).
[28] Po-An Chen,et al. Generalized mirror descents in congestion games , 2016, Artif. Intell..
[29] Valentin Goranko,et al. The Game-Theoretic Framework , 2016 .
[30] Po-An Chen,et al. Playing Congestion Games with Bandit Feedbacks , 2015, AAMAS.
[31] Pierre Coucheney,et al. Penalty-Regulated Dynamics and Robust Learning Procedures in Games , 2013, Math. Oper. Res..
[32] Alexandre M. Bayen,et al. On the convergence of no-regret learning in selfish routing , 2014, ICML.
[33] Constantinos Daskalakis,et al. On the complexity of approximating a Nash equilibrium , 2011, SODA '11.
[34] Christian Ibars,et al. Distributed Demand Management in Smart Grid with a Congestion Game , 2010, 2010 First IEEE International Conference on Smart Grid Communications.
[35] Roberto Cominetti,et al. A Payoff-based Learning Procedure and Its Application to Traffic Games , 2010, Games and Economic Behavior.
[36] András Lörincz,et al. Optimistic initialization and greediness lead to polynomial time learning in factored MDPs , 2009, ICML '09.
[37] John N. Tsitsiklis,et al. Efficiency loss in a network resource allocation game: the case of elastic supply , 2004, IEEE Transactions on Automatic Control.
[38] Paul G. Spirakis,et al. The structure and complexity of Nash equilibria for a selfish routing game , 2002, Theor. Comput. Sci..
[39] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.