暂无分享,去创建一个
[1] Shipra Agrawal,et al. Thompson Sampling for Contextual Bandits with Linear Payoffs , 2012, ICML.
[2] Csaba Szepesvari,et al. Bandit Algorithms , 2020 .
[3] W. R. Thompson. ON THE LIKELIHOOD THAT ONE UNKNOWN PROBABILITY EXCEEDS ANOTHER IN VIEW OF THE EVIDENCE OF TWO SAMPLES , 1933 .
[4] Peter S. Fader,et al. Customer Acquisition via Display Advertising Using Multi-Armed Bandit Experiments , 2016, Mark. Sci..
[5] John N. Tsitsiklis,et al. Linearly Parameterized Bandits , 2008, Math. Oper. Res..
[6] Shipra Agrawal,et al. Near-Optimal Regret Bounds for Thompson Sampling , 2017, J. ACM.
[7] Amin Karbasi,et al. Regret Bounds for Batched Bandits , 2019, AAAI.
[8] Xiangyang Ji,et al. Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition , 2020, NeurIPS.
[9] Amin Karbasi,et al. Adaptivity in Adaptive Submodularity , 2019, COLT.
[10] Lihong Li,et al. An Empirical Evaluation of Thompson Sampling , 2011, NIPS.
[11] Andreas Krause,et al. Parallelizing Exploration-Exploitation Tradeoffs with Gaussian Process Bandit Optimization , 2012, ICML.
[12] Yanjun Han,et al. Batched Multi-armed Bandits Problem , 2019, NeurIPS.
[13] Sébastien Bubeck,et al. Prior-free and prior-dependent regret bounds for Thompson Sampling , 2013, 2014 48th Annual Conference on Information Sciences and Systems (CISS).
[14] Andreas Krause,et al. Near-optimal Batch Mode Active Learning and Adaptive Submodular Optimization , 2013, ICML.
[15] Wei Chu,et al. Contextual Bandits with Linear Payoff Functions , 2011, AISTATS.
[16] John Langford,et al. Contextual Bandit Algorithms with Supervised Learning Guarantees , 2010, AISTATS.
[17] Shipra Agrawal,et al. Further Optimal Regret Bounds for Thompson Sampling , 2012, AISTATS.
[18] Morteza Zadimoghaddam,et al. Submodular Maximization with Nearly Optimal Approximation, Adaptivity and Query Complexity , 2018, SODA.
[19] Shipra Agrawal,et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem , 2011, COLT.
[20] Benjamin Van Roy,et al. Learning to Optimize via Posterior Sampling , 2013, Math. Oper. Res..
[21] Nicolas Vayatis,et al. Parallel Gaussian Process Optimization with Upper Confidence Bound and Pure Exploration , 2013, ECML/PKDD.
[22] Vianney Perchet,et al. Batched Bandit Problems , 2015, COLT.
[23] Long Tran-Thanh,et al. Efficient Thompson Sampling for Online Matrix-Factorization Recommendation , 2015, NIPS.
[24] Yuan Zhou,et al. Linear bandits with limited adaptivity and learning distributional optimal design , 2020, STOC.
[25] T. L. Lai Andherbertrobbins. Asymptotically Efficient Adaptive Allocation Rules , 2022 .
[26] Thomas P. Hayes,et al. Stochastic Linear Optimization under Bandit Feedback , 2008, COLT.
[27] Joaquin Quiñonero Candela,et al. Web-Scale Bayesian Click-Through rate Prediction for Sponsored Search Advertising in Microsoft's Bing Search Engine , 2010, ICML.
[28] Xiaokui Xiao,et al. MOTS: Minimax Optimal Thompson Sampling , 2020, ArXiv.
[29] Volkan Cevher,et al. High-Dimensional Bayesian Optimization via Additive Models with Overlapping Groups , 2018, AISTATS.
[30] Amin Karbasi,et al. Minimax Regret of Switching-Constrained Online Convex Optimization: No Phase Transition , 2020, NeurIPS.
[31] Kirthevasan Kandasamy,et al. Parallelised Bayesian Optimisation via Thompson Sampling , 2018, AISTATS.
[32] Arpit Agarwal,et al. Stochastic Submodular Cover with Limited Adaptivity , 2019, SODA.
[33] Csaba Szepesvári,et al. Improved Algorithms for Linear Stochastic Bandits , 2011, NIPS.
[34] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..
[35] Lalit Jain,et al. Sequential Experimental Design for Transductive Linear Bandits , 2019, NeurIPS.
[36] Eric Balkanski,et al. Parallelization does not Accelerate Convex Optimization: Adaptivity Lower Bounds for Non-smooth Convex Minimization , 2018, ArXiv.
[37] Jack Bowden,et al. Multi-armed Bandit Models for the Optimal Design of Clinical Trials: Benefits and Challenges. , 2015, Statistical science : a review journal of the Institute of Mathematical Statistics.
[38] Pushmeet Kohli,et al. Batched Gaussian Process Bandit Optimization via Determinantal Point Processes , 2016, NIPS.
[39] Rémi Munos,et al. A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences , 2011, COLT.
[40] Sébastien Bubeck,et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn..
[41] Amin Karbasi,et al. Unconstrained submodular maximization with constant adaptive complexity , 2019, STOC.
[42] Ayfer Özgür,et al. Batched Thompson Sampling , 2021, NeurIPS.
[43] Aurélien Garivier,et al. The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond , 2011, COLT.
[44] Zi Wang,et al. Batched Large-scale Bayesian Optimization in High-dimensional Spaces , 2017, AISTATS.
[45] Lalit Jain,et al. Improved Confidence Bounds for the Linear Logistic Model and Applications to Bandits , 2021, ICML.
[46] Rémi Munos,et al. Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis , 2012, ALT.
[47] Donald A. Berry,et al. Bandit Problems: Sequential Allocation of Experiments. , 1986 .
[48] Aleksandrs Slivkins,et al. Introduction to Multi-Armed Bandits , 2019, Found. Trends Mach. Learn..
[49] Eric Balkanski,et al. The adaptive complexity of maximizing a submodular function , 2018, STOC.