Multi-armed Bandits with Cost Subsidy
[1] Samuel Daulton, et al. Thompson Sampling for Contextual Bandit Problems with Auxiliary Safety Constraints, 2019, arXiv.
[2] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933, Biometrika.
[3] Kirthevasan Kandasamy, et al. A Flexible Framework for Multi-Objective Bayesian Optimization using Random Scalarizations, 2018, UAI.
[4] Mark Huber. Nearly Optimal Bernoulli Factories for Linear Functions, 2016, Comb. Probab. Comput.
[5] Michèle Sebag, et al. Exploration vs Exploitation vs Safety: Risk-Aware Multi-Armed Bandits, 2013, ACML.
[6] Archie C. Chapman, et al. Knapsack Based Optimal Policies for Budget-Limited Multi-Armed Bandits, 2012, AAAI.
[7] Clayton Scott, et al. Top Feasible Arm Identification, 2019, AISTATS.
[8] Milton Abramowitz, et al. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 1964.
[9] Vashist Avadhanula, et al. A Near-Optimal Exploration-Exploitation Approach for Assortment Selection, 2016, EC.
[10] Christos Thrampoulidis, et al. Generalized Linear Bandits with Safety Constraints, 2020, IEEE ICASSP.
[11] Claudio Gentile, et al. Regret Minimization for Reserve Prices in Second-Price Auctions, 2015, IEEE Transactions on Information Theory.
[12] Steven L. Scott, et al. A modern Bayesian look at the multi-armed bandit, 2010.
[13] Nikhil R. Devanur, et al. Bandits with concave rewards and convex knapsacks, 2014, EC.
[14] Aleksandrs Slivkins, et al. Online decision making in crowdsourcing markets: theoretical challenges, 2013, SIGecom Exchanges.
[15] Ann Nowé, et al. Designing multi-objective multi-armed bandits algorithms: A study, 2013, IJCNN.
[16] Bernard Manderick, et al. Annealing-Pareto multi-objective multi-armed bandit algorithm, 2014, IEEE ADPRL.
[17] Wei Chen, et al. Combinatorial Pure Exploration of Multi-Armed Bandits, 2014, NIPS.
[18] Wei Cao, et al. On Top-k Selection in Multi-Armed Bandits and Hidden Bipartite Graphs, 2015, NIPS.
[19] Yifan Wu, et al. Conservative Bandits, 2016, ICML.
[20] Ananthram Swami, et al. Distributed Algorithms for Learning and Cognitive Medium Access with Logarithmic Regret, 2010, IEEE Journal on Selected Areas in Communications.
[21] Sanjay Shakkottai, et al. Social Learning in Multi Agent Multi Armed Bandits, 2019, Proc. ACM Meas. Anal. Comput. Syst.
[22] Shipra Agrawal, et al. Near-Optimal Regret Bounds for Thompson Sampling, 2017, J. ACM.
[23] Songwu Lu, et al. Analysis of the Reliability of a Nationwide Short Message Service, 2007, IEEE INFOCOM.
[24] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[25] Songwu Lu, et al. A study of the short message service of a nationwide cellular network, 2006, IMC.
[26] John Langford, et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information, 2007, NIPS.
[27] Vashist Avadhanula, et al. Thompson Sampling for the MNL-Bandit, 2017, COLT.
[28] Robert D. Nowak, et al. Best-arm identification algorithms for multi-armed bandits in the fixed confidence setting, 2014, CISS.
[29] Wei Chu, et al. Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms, 2011, WSDM.
[30] Benjamin Van Roy, et al. Conservative Contextual Linear Bandits, 2016, NIPS.
[31] Aleksandrs Slivkins, et al. Introduction to Multi-Armed Bandits, 2019, Found. Trends Mach. Learn.
[32] Aleksandrs Slivkins, et al. Bandits with Knapsacks, 2013, FOCS.
[33] Jian Li, et al. Pure Exploration of Multi-armed Bandit Under Matroid Constraints, 2016, COLT.
[34] Lihong Li, et al. Counterfactual Estimation and Optimization of Click Metrics in Search Engines: A Case Study, 2015, WWW.
[35] Nicole Immorlica, et al. Adversarial Bandits with Knapsacks, 2019, FOCS.
[36] George L. O'Brien, et al. A Bernoulli factory, 1994, ACM Trans. Model. Comput. Simul.
[37] Sébastien Bubeck, et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, 2012, Found. Trends Mach. Learn.
[38] M. Abramowitz, et al. Handbook of Mathematical Functions, with Formulas, Graphs, and Mathematical Tables, 1966.
[39] Marnelli Canlas, et al. A quantitative analysis of the Quality of Service of Short Message Service in the Philippines, 2010, IEEE International Conference on Communication Systems.
[40] H. Robbins. Some aspects of the sequential design of experiments, 1952.
[41] Bernard Manderick, et al. Thompson Sampling for Multi-Objective Multi-Armed Bandits Problem, 2015, ESANN.
[42] Osunade Oluwaseyitanfunmi, et al. Route Optimization for Delivery of Short Message Service in Telecommunication Networks, 2015.
[43] Lihong Li, et al. An Empirical Evaluation of Thompson Sampling, 2011, NIPS.
[44] David S. Leslie, et al. Optimistic Bayesian Sampling in Contextual-Bandit Problems, 2012, J. Mach. Learn. Res.