An Empirical Evaluation of Thompson Sampling
[1] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933, Biometrika.
[2] J. Bather, et al. Multi-Armed Bandit Allocation Indices, 1990.
[3] J. Sarkar. One-Armed Bandit Problems with Covariates, 1991.
[4] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[5] J. Langford. Tutorial on Practical Prediction Theory for Classification, 2005, J. Mach. Learn. Res.
[6] Deepak Agarwal, et al. Online Models for Content Optimization, 2008, NIPS.
[7] Kilian Q. Weinberger, et al. Feature Hashing for Large Scale Multitask Learning, 2009, ICML '09.
[8] Joaquin Quiñonero Candela, et al. Web-Scale Bayesian Click-Through Rate Prediction for Sponsored Search Advertising in Microsoft's Bing Search Engine, 2010, ICML.
[9] Lihong Li, et al. Learning from Logged Implicit Exploration Data, 2010, NIPS.
[10] Rajiv Khanna, et al. Estimating Rates of Rare Events with Multiple Hierarchies through Scalable Log-Linear Models, 2010, KDD '10.
[11] Ole-Christoffer Granmo, et al. Solving Two-Armed Bernoulli Bandit Problems Using a Bayesian Learning Automaton, 2010, Int. J. Intell. Comput. Cybern.
[12] Wei Chu, et al. A Contextual-Bandit Approach to Personalized News Article Recommendation, 2010, WWW '10.
[13] Steven L. Scott. A Modern Bayesian Look at the Multi-Armed Bandit, 2010.
[14] Benedict C. May. Simulation Studies in Optimistic Bayesian Sampling in Contextual-Bandit Problems, 2011.
[15] Kevin D. Glazebrook, et al. Multi-Armed Bandit Allocation Indices, 2011.
[16] Wei Chu, et al. Unbiased Offline Evaluation of Contextual-Bandit-Based News Article Recommendation Algorithms, 2011, WSDM '11.
[17] David S. Leslie, et al. Optimistic Bayesian Sampling in Contextual-Bandit Problems, 2012, J. Mach. Learn. Res.
[18] T. L. Lai, Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985, Advances in Applied Mathematics.