Bounding Regret in Empirical Games
Long Tran-Thanh | Arunesh Sinha | Steven Jecmen | Zun Li
[1] Shie Mannor, et al. PAC Bounds for Multi-armed Bandit and Markov Decision Processes, 2002, COLT.
[2] Peter L. Bartlett, et al. Improved Learning Complexity in Combinatorial Pure Exploration Bandits, 2016, AISTATS.
[3] Wei Chen, et al. Combinatorial Pure Exploration with Continuous and Separable Reward Functions and Its Applications (Extended Version), 2018, IJCAI.
[4] David Silver, et al. A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning, 2017, NIPS.
[5] Michael P. Wellman. Methods for Empirical Game-Theoretic Analysis, 2006, AAAI.
[6] John N. Tsitsiklis, et al. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem, 2004, J. Mach. Learn. Res.
[7] Varun Grover, et al. Active Learning in Multi-armed Bandits, 2008, ALT.
[8] Michael P. Wellman, et al. Searching for approximate equilibria in empirical games, 2008, AAMAS.
[9] Michael P. Wellman, et al. Empirical Game-Theoretic Analysis for Moving Target Defense, 2015, MTD@CCS.
[10] Michael P. Wellman, et al. A Cloaking Mechanism to Mitigate Market Manipulation, 2018, IJCAI.
[11] Ruosong Wang, et al. Nearly Optimal Sampling Algorithms for Combinatorial Pure Exploration, 2017, COLT.
[12] Aurélien Garivier, et al. On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models, 2014, J. Mach. Learn. Res.
[13] Michael P. Wellman, et al. Strategy exploration in empirical games, 2010, AAMAS.
[14] Stefano Ermon, et al. Adaptive Concentration Inequalities for Sequential Decision Problems, 2016, NIPS.
[15] Stefano Ermon, et al. Best arm identification in multi-armed bandits with delayed feedback, 2018, AISTATS.
[16] Oren Somekh, et al. Almost Optimal Exploration in Multi-Armed Bandits, 2013, ICML.
[17] Wei Chen, et al. Combinatorial Pure Exploration of Multi-Armed Bandits, 2014, NIPS.
[18] Alessandro Lazaric, et al. Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence, 2012, NIPS.
[19] Rémi Munos, et al. Pure exploration in finitely-armed and continuous-armed bandits, 2011, Theor. Comput. Sci.
[20] Jun Zhu, et al. Identify the Nash Equilibrium in Static Games with Random Payoffs, 2017, ICML.
[21] Sébastien Bubeck, et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, 2012, Found. Trends Mach. Learn.
[22] Joel Z. Leibo, et al. A Generalised Method for Empirical Game Theoretic Analysis, 2018, AAMAS.
[23] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[24] Varun Grover, et al. Active learning in heteroscedastic noise, 2010, Theor. Comput. Sci.
[25] Alessandro Lazaric, et al. Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits, 2011, ALT.
[26] Alessandro Lazaric, et al. Multi-Bandit Best Arm Identification, 2011, NIPS.
[27] Csaba Szepesvári, et al. Empirical Bernstein stopping, 2008, ICML.
[28] Shie Mannor, et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems, 2006, J. Mach. Learn. Res.
[29] Michael P. Wellman, et al. Strategic Agent-Based Modeling of Financial Markets, 2017, RSF.
[30] Ambuj Tewari, et al. PAC Subset Selection in Stochastic Multi-armed Bandits, 2012, ICML.
[31] Robert D. Nowak, et al. Best-arm identification algorithms for multi-armed bandits in the fixed confidence setting, 2014, CISS.