Multi-armed Bandit with Additional Observations
[1] Alexandre Proutière, et al. Optimal Rate Sampling in 802.11 systems, 2013, IEEE INFOCOM 2014 - IEEE Conference on Computer Communications.
[2] R. Srikant, et al. Bandits with Budgets, 2015, SIGMETRICS.
[3] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[4] Aurélien Garivier, et al. The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond, 2011, COLT.
[5] Kevin C. Almeroth, et al. Joint rate and channel width adaptation for 802.11 MIMO wireless networks, 2013, IEEE International Conference on Sensing, Communications and Networking (SECON).
[6] Filip Radlinski, et al. Ranked bandits in metric spaces: learning diverse rankings over large document collections, 2013, J. Mach. Learn. Res.
[7] Jon M. Kleinberg, et al. Incentivizing exploration, 2014, EC.
[8] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1997, EuroCOLT.
[9] Ron Kohavi. Online Controlled Experiments: Lessons from Running A/B/n Tests for 12 Years, 2015, KDD.
[10] Marc Lelarge, et al. Leveraging Side Observations in Stochastic Bandits, 2012, UAI.
[11] Satyen Kale, et al. Multiarmed Bandits With Limited Expert Advice, 2013, COLT.
[12] Shie Mannor, et al. From Bandits to Experts: On the Value of Side-Observations, 2011, NIPS.
[13] Rémi Munos, et al. Efficient learning by implicit exploration in bandit problems with side observations, 2014, NIPS.
[14] Noga Alon, et al. From Bandits to Experts: A Tale of Domination and Independence, 2013, NIPS.
[15] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[16] Koby Crammer, et al. Prediction with Limited Advice and Multiarmed Bandits with Paid Observations, 2014, ICML.
[17] J. J. Garcia-Luna-Aceves, et al. A practical approach to rate adaptation for multi-antenna systems, 2011, 19th IEEE International Conference on Network Protocols.
[18] John C. Bicket, et al. Bit-rate selection in wireless networks, 2005.
[19] H. Robbins. Some aspects of the sequential design of experiments, 1952.
[20] Tao Qin, et al. Estimation Bias in Multi-Armed Bandit Algorithms for Search Advertising, 2013, NIPS.
[21] Songwu Lu, et al. MIMO rate adaptation in 802.11n wireless networks, 2010, MobiCom.
[22] Atilla Eryilmaz, et al. Stochastic bandits with side observations on networks, 2014, SIGMETRICS '14.
[23] H. Robbins, et al. Asymptotically efficient adaptive allocation rules, 1985.
[24] Yishay Mansour, et al. Bayesian Incentive-Compatible Bandit Exploration, 2018.
[25] Aurélien Garivier, et al. On Bayesian Upper Confidence Bounds for Bandit Problems, 2012, AISTATS.
[26] Jean-Yves Audibert, et al. Regret Bounds and Minimax Policies under Partial Monitoring, 2010, J. Mach. Learn. Res.
[27] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[28] Aurélien Garivier, et al. On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models, 2014, J. Mach. Learn. Res.
[29] Deepak S. Turaga, et al. Budgeted Prediction with Expert Advice, 2015, AAAI.
[30] Joaquin Quiñonero Candela, et al. Web-Scale Bayesian Click-Through Rate Prediction for Sponsored Search Advertising in Microsoft's Bing Search Engine, 2010, ICML.
[31] Santosh S. Vempala, et al. Efficient algorithms for online decision problems, 2005, Journal of Computer and System Sciences.