The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation
Haifeng Xu | David C. Parkes | Zhe Feng