[1] Sergei Vassilvitskii,et al. Adaptive Bidding for Display Advertising , 2009, WWW.
[2] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[3] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables , 1963 .
[4] Sudipto Guha,et al. Approximation algorithms for budgeted learning problems , 2007, STOC '07.
[5] John N. Tsitsiklis,et al. The Complexity of Optimal Queuing Network Control , 1999, Math. Oper. Res..
[6] Archie C. Chapman,et al. ε-First Policies for Budget-Limited Multi-Armed Bandits , 2010 .
[7] John N. Tsitsiklis,et al. Introduction to linear optimization , 1997, Athena scientific optimization and computation series.
[8] Aleksandrs Slivkins,et al. Bandits with Knapsacks , 2013, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.
[9] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..
[10] Aurélien Garivier,et al. The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond , 2011, COLT.
[11] Nikhil R. Devanur,et al. Bandits with concave rewards and convex knapsacks , 2014, EC.
[12] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[13] R. Srikant,et al. Bandits with Budgets , 2015, SIGMETRICS.
[14] Jean-Yves Audibert,et al. Regret Bounds and Minimax Policies under Partial Monitoring , 2010, J. Mach. Learn. Res..
[15] Nicholas R. Jennings,et al. Efficient Crowdsourcing of Unknown Experts using Multi-Armed Bandits , 2012, ECAI.
[16] Tao Qin,et al. Multi-Armed Bandit with Budget Constraint and Variable Costs , 2013, AAAI.
[17] H. Robbins. Some aspects of the sequential design of experiments , 1952 .
[18] R. Munos,et al. Best Arm Identification in Multi-Armed Bandits , 2010, COLT.
[19] Rémi Munos,et al. Pure Exploration in Multi-armed Bandits Problems , 2009, ALT.
[20] Vijay Kumar,et al. Online learning in online auctions , 2003, SODA '03.
[21] Archie C. Chapman,et al. Knapsack Based Optimal Policies for Budget-Limited Multi-Armed Bandits , 2012, AAAI.
[22] Csaba Szepesvári,et al. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits , 2009, Theor. Comput. Sci..
[23] Sébastien Bubeck. Bandits Games and Clustering Foundations , 2010 .
[24] Moshe Babaioff,et al. Dynamic Pricing with Limited Supply , 2011, ACM Trans. Economics and Comput..
[25] Nicholas R. Jennings,et al. Efficient Regret Bounds for Online Bid Optimisation in Budget-Limited Sponsored Search Auctions , 2014, UAI.
[26] Shipra Agrawal,et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem , 2011, COLT.
[27] Nicolò Cesa-Bianchi,et al. Combinatorial Bandits , 2012, COLT.
[28] Omar Besbes,et al. Blind Network Revenue Management , 2011, Oper. Res..
[29] W. R. Thompson. On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .
[30] D. Simchi-Levi,et al. Online Network Revenue Management Using Thompson Sampling , 2017 .
[31] Omar Besbes,et al. Dynamic Pricing Without Knowing the Demand Function: Risk Bounds and Near-Optimal Algorithms , 2009, Oper. Res..
[32] Frank Thomson Leighton,et al. The value of knowing a demand curve: bounds on regret for online posted-price auctions , 2003, 44th Annual IEEE Symposium on Foundations of Computer Science (FOCS).
[33] Aleksandrs Slivkins,et al. Dynamic Ad Allocation: Bandits with Budgets , 2013, ArXiv.