Gamification of Pure Exploration for Linear Bandits
Rémy Degenne | Pierre Ménard | Xuedong Shang | Michal Valko
[1] Corwin L. Atwood, et al. Optimal and Efficient Designs of Experiments, 1969.
[2] Philip Wolfe, et al. An algorithm for quadratic programming, 1956.
[3] Rémi Munos, et al. Pure Exploration in Multi-armed Bandits Problems, 2009, ALT.
[4] Walter T. Federer, et al. Sequential Design of Experiments, 1967.
[5] Masashi Sugiyama, et al. Fully Adaptive Algorithm for Pure Exploration in Linear Bandits, 2017, ArXiv:1710.05552.
[6] J. Kiefer, et al. Optimum Designs in Regression Problems, 1959.
[7] Yurii Nesterov, et al. Relatively Smooth Convex Optimization by First-Order Methods, and Applications, 2016, SIAM J. Optim..
[8] Oren Somekh, et al. Almost Optimal Exploration in Multi-Armed Bandits, 2013, ICML.
[9] R. Munos, et al. Best Arm Identification in Multi-Armed Bandits, 2010, COLT.
[10] Alessandro Lazaric, et al. Best-Arm Identification in Linear Bandits, 2014, NIPS.
[11] Aurélien Garivier, et al. Explore First, Exploit Next: The True Shape of Regret in Bandit Problems, 2016, Math. Oper. Res..
[12] Wouter M. Koolen, et al. Pure Exploration with Multiple Correct Answers, 2019, NeurIPS.
[13] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res..
[14] Lalit Jain, et al. Sequential Experimental Design for Transductive Linear Bandits, 2019, NeurIPS.
[15] Peng Sun, et al. Linear convergence of a modified Frank–Wolfe algorithm for computing minimum-volume enclosing ellipsoids, 2008, Optim. Methods Softw..
[16] Shie Mannor, et al. Action Elimination and Stopping Conditions for Reinforcement Learning, 2003, ICML.
[17] Wouter M. Koolen, et al. Non-Asymptotic Pure Exploration by Solving Games, 2019, NeurIPS.
[18] Lawrence M. Wein, et al. Best Arm Identification in Generalized Linear Bandits, 2019, Oper. Res. Lett..
[19] Eiji Takimoto, et al. Efficient Sampling Method for Monte Carlo Tree Search Problem, 2014, IEICE Trans. Inf. Syst..
[20] Csaba Szepesvári, et al. Improved Algorithms for Linear Stochastic Bandits, 2011, NIPS.
[21] H. Robbins. Some aspects of the sequential design of experiments, 1952.
[22] Aditya Gopalan, et al. Towards Optimal and Efficient Best Arm Identification in Linear Bandits, 2019, ArXiv.
[23] Alexandra Carpentier, et al. An optimal algorithm for the Thresholding Bandit Problem, 2016, ICML.
[24] Nando de Freitas, et al. On correlation and budget constraints in model-based bandit optimization with application to automatic machine learning, 2014, AISTATS.
[25] Michal Valko, et al. Fixed-Confidence Guarantees for Bayesian Best-Arm Identification, 2019, AISTATS.
[26] Rémi Munos, et al. Pure Exploration for Multi-Armed Bandit Problems, 2008, ArXiv.
[27] Shie Mannor, et al. PAC Bounds for Multi-armed Bandit and Markov Decision Processes, 2002, COLT.
[28] Wouter M. Koolen, et al. Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling, 2018, NeurIPS.
[29] Wouter M. Koolen, et al. Structure Adaptive Algorithms for Stochastic Bandits, 2020, ICML.
[30] Jinbo Bi, et al. Active learning via transductive experimental design, 2006, ICML.
[31] Robert D. Nowak, et al. Anytime Exploration for Multi-armed Bandits using Confidence Information, 2016, ICML.
[32] Diego Klabjan, et al. Improving the Expected Improvement Algorithm, 2017, NIPS.
[33] Alessandro Lazaric, et al. Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence, 2012, NIPS.
[34] Pierre Ménard, et al. Gradient Ascent for Active Exploration in Bandit Problems, 2019, ArXiv.
[35] Aurélien Garivier, et al. Optimal Best Arm Identification with Fixed Confidence, 2016, COLT.
[36] Wouter M. Koolen, et al. Follow the leader if you can, hedge if you must, 2013, J. Mach. Learn. Res..
[37] Yuan Zhou, et al. Best Arm Identification in Linear Bandits with Linear Dimension Dependency, 2018, ICML.
[38] Peter Stone, et al. Efficient Selection of Multiple Bandit Arms: Theory and Practice, 2010, ICML.
[39] Wei Chen, et al. Combinatorial Pure Exploration of Multi-Armed Bandits, 2014, NIPS.
[40] Michael Jackson, et al. Optimal Design of Experiments, 1994.