Logarithmic regret bounds for Bandits with Knapsacks
[1] Nicolò Cesa-Bianchi, et al. Combinatorial Bandits, 2012, COLT.
[2] Omar Besbes, et al. Blind Network Revenue Management, 2011, Oper. Res.
[3] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.
[4] Nikhil R. Devanur, et al. An efficient algorithm for contextual bandits with knapsacks, and an extension to concave objectives, 2015, COLT.
[5] Tao Qin, et al. Multi-Armed Bandit with Budget Constraint and Variable Costs, 2013, AAAI.
[6] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[7] Aleksandrs Slivkins, et al. Dynamic Ad Allocation: Bandits with Budgets, 2013, arXiv.
[8] Deepak S. Turaga, et al. Budgeted Prediction with Expert Advice, 2015, AAAI.
[9] Nicholas R. Jennings, et al. Efficient Regret Bounds for Online Bid Optimisation in Budget-Limited Sponsored Search Auctions, 2014, UAI.
[10] Shipra Agrawal, et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem, 2011, COLT.
[11] John N. Tsitsiklis, et al. Introduction to Linear Optimization, 1997, Athena Scientific.
[12] D. Simchi-Levi, et al. Online Network Revenue Management Using Thompson Sampling, 2017.
[13] Omar Besbes, et al. Dynamic Pricing Without Knowing the Demand Function: Risk Bounds and Near-Optimal Algorithms, 2009, Oper. Res.
[14] Peter Jacko, et al. Generalized Restless Bandits and the Knapsack Problem for Perishable Inventories, 2014, Oper. Res.
[15] T. L. Lai, Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985.
[16] H. Robbins. Some aspects of the sequential design of experiments, 1952.
[17] L. Lovász, et al. Geometric Algorithms and Combinatorial Optimization, 1988.
[18] Archie C. Chapman, et al. Knapsack Based Optimal Policies for Budget-Limited Multi-Armed Bandits, 2012, AAAI.
[19] Sébastien Bubeck. Bandits Games and Clustering Foundations, 2010.
[20] Moshe Babaioff, et al. Dynamic Pricing with Limited Supply, 2011, ACM Trans. Economics and Comput.
[21] Jean-Yves Audibert, et al. Regret Bounds and Minimax Policies under Partial Monitoring, 2010, J. Mach. Learn. Res.
[22] Rémi Munos, et al. Pure Exploration in Multi-armed Bandits Problems, 2009, ALT.
[23] Vijay Kumar, et al. Online learning in online auctions, 2003, SODA '03.
[24] R. Srikant, et al. Bandits with Budgets, 2015, SIGMETRICS.
[25] Frank Thomson Leighton, et al. The value of knowing a demand curve: bounds on regret for online posted-price auctions, 2003, FOCS.
[26] John N. Tsitsiklis, et al. The Complexity of Optimal Queuing Network Control, 1999, Math. Oper. Res.
[27] Archie C. Chapman, et al. ε-first policies for budget-limited multi-armed bandits, 2010, AAAI.
[28] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[29] Aurélien Garivier, et al. The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond, 2011, COLT.
[30] Marcello Restelli, et al. Budgeted Multi-Armed Bandit in Continuous Action Space, 2016, ECAI.
[31] R. Srikant, et al. Algorithms with Logarithmic or Sublinear Regret for Constrained Contextual Bandits, 2015, NIPS.
[32] Robert D. Kleinberg, et al. Learning on a budget: posted price mechanisms for online procurement, 2012, EC '12.
[33] Aleksandrs Slivkins, et al. Bandits with Knapsacks, 2013, FOCS.
[34] Nenghai Yu, et al. Budgeted Bandit Problems with Continuous Random Costs, 2015, ACML.
[35] Nicholas R. Jennings, et al. Efficient Crowdsourcing of Unknown Experts using Multi-Armed Bandits, 2012, ECAI.
[36] Sudipto Guha, et al. Approximation algorithms for budgeted learning problems, 2007, STOC '07.
[37] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[38] Nikhil R. Devanur, et al. Bandits with concave rewards and convex knapsacks, 2014, EC.
[39] John Langford, et al. Resourceful Contextual Bandits, 2014, COLT.
[40] Archie C. Chapman, et al. ε-First Policies for Budget-Limited Multi-Armed Bandits, 2010.
[41] Sergei Vassilvitskii, et al. Adaptive Bidding for Display Advertising, 2009, WWW.
[42] Nicholas R. Jennings, et al. Long-term information collection with energy harvesting wireless sensors: a multi-armed bandit based approach, 2012, Autonomous Agents and Multi-Agent Systems.
[43] Vianney Perchet, et al. Online learning in repeated auctions, 2015, COLT.