[1] Tie-Yan Liu,et al. Joint optimization of bid and budget allocation in sponsored search , 2012, KDD.
[2] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[3] Josef Hadar,et al. Rules for Ordering Uncertain Prospects , 1969 .
[4] H. Robbins,et al. Asymptotically efficient adaptive allocation rules , 1985 .
[5] Jean-Yves Audibert,et al. Minimax Policies for Adversarial and Stochastic Bandits , 2009, COLT.
[6] M. de Rijke,et al. BubbleRank: Safe Online Learning to Rerank , 2018, ArXiv.
[7] Nicolò Cesa-Bianchi,et al. Combinatorial Bandits , 2012, COLT.
[8] Peter Auer,et al. UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem , 2010, Period. Math. Hung.
[9] Alan Slomson. Introduction to Combinatorics , 1997 .
[10] Thomas P. Hayes,et al. Stochastic Linear Optimization under Bandit Feedback , 2008, COLT.
[11] V. Bawa. Optimal Rules for Ordering Uncertain Prospects , 1975 .
[12] David Liau,et al. Stochastic Multi-armed Bandits in Constant Space , 2017, AISTATS.
[13] Wei Chen,et al. Combinatorial Partial Monitoring Game with Linear Feedback and Its Applications , 2014, ICML.
[14] Marcello Restelli,et al. A Combinatorial-Bandit Algorithm for the Online Joint Bid/Budget Optimization of Pay-per-Click Advertising Campaigns , 2018, AAAI.
[15] Shipra Agrawal,et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem , 2011, COLT.
[16] Zheng Wen,et al. Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits , 2014, AISTATS.
[17] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .
[18] Wei Chen,et al. Combinatorial Multi-Armed Bandit: General Framework and Applications , 2013, ICML.
[19] Thomas P. Hayes,et al. The Price of Bandit Information for Online Optimization , 2007, NIPS.
[20] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput.
[21] Gábor Lugosi,et al. Regret in Online Combinatorial Optimization , 2012, Math. Oper. Res.
[22] O. Krafft,et al. A Note on Hoeffding's Inequality , 1969 .
[23] Zheng Wen,et al. Cascading Bandits: Learning to Rank in the Cascade Model , 2015, ICML.
[24] Yi Gai,et al. Learning Multiuser Channel Allocations in Cognitive Radio Networks: A Combinatorial Multi-Armed Bandit Formulation , 2010, IEEE Symposium on New Frontiers in Dynamic Spectrum (DySPAN).