Multi-armed Bandit with Additional Observations
Donggyu Yun | Alexandre Proutière | Sumyeong Ahn | Jinwoo Shin | Yung Yi