Bandits with Budgets
[1] T. Lai. Adaptive treatment allocation and the multi-armed bandit problem, 1987.
[2] Archie C. Chapman, et al. Epsilon-First Policies for Budget-Limited Multi-Armed Bandits, 2010, AAAI.
[3] Josef Broder, et al. Dynamic Pricing Under a General Parametric Choice Model, 2012, Oper. Res.
[4] Tamás Linder, et al. The On-Line Shortest Path Problem Under Partial Monitoring, 2007, J. Mach. Learn. Res.
[5] Robert D. Kleinberg, et al. Regret bounds for sleeping experts and bandits, 2010, Machine Learning.
[6] Aleksandrs Slivkins, et al. Dynamic Ad Allocation: Bandits with Budgets, 2013, arXiv.
[7] H. Robbins. Some aspects of the sequential design of experiments, 1952.
[8] Aleksandrs Slivkins, et al. Bandits with Knapsacks, 2013, IEEE 54th Annual Symposium on Foundations of Computer Science (FOCS).
[9] Alexandre B. Tsybakov, et al. Introduction to Nonparametric Estimation, 2008, Springer Series in Statistics.
[10] Shie Mannor, et al. Unimodal Bandits, 2011, ICML.
[11] Robert D. Kleinberg. Nearly Tight Bounds for the Continuum-Armed Bandit Problem, 2004, NIPS.
[12] Eric W. Cope, et al. Regret and Convergence Bounds for a Class of Continuum-Armed Bandit Problems, 2009, IEEE Transactions on Automatic Control.
[13] T. L. Lai, Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985, Advances in Applied Mathematics.
[14] Rémi Munos, et al. Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis, 2012, ALT.
[15] Archie C. Chapman, et al. Knapsack Based Optimal Policies for Budget-Limited Multi-Armed Bandits, 2012, AAAI.
[16] Aurélien Garivier, et al. Informational confidence bounds for self-normalized averages and applications, 2013, IEEE Information Theory Workshop (ITW).
[18] Aurélien Garivier, et al. The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond, 2011, COLT.
[19] Vianney Perchet, et al. Bounded regret in stochastic multi-armed bandits, 2013, COLT.
[20] Filip Radlinski, et al. Ranked bandits in metric spaces: learning diverse rankings over large document collections, 2013, J. Mach. Learn. Res.
[21] Alexandre Proutière, et al. Unimodal Bandits: Regret Lower Bounds and Optimal Algorithms, 2014, ICML.
[22] Nicolò Cesa-Bianchi, et al. Combinatorial Bandits, 2012, COLT.
[23] T. Kailath. The Divergence and Bhattacharyya Distance Measures in Signal Selection, 1967.
[24] Tamás Linder, et al. The Shortest Path Problem Under Partial Monitoring, 2006, COLT.
[25] Alexandre Proutière, et al. Optimal Rate Sampling in 802.11 systems, 2013, IEEE INFOCOM 2014 - IEEE Conference on Computer Communications.
[26] R. Srikant, et al. Bandits with budgets, 2013, 52nd IEEE Conference on Decision and Control.
[27] H. Wynn, et al. Algebraic and Geometric Methods in Statistics: Introduction to non-parametric estimation, 2009.
[28] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.