Incentivized Exploration for Multi-Armed Bandits under Reward Drift
[1] Andreas Krause,et al. Learning User Preferences to Incentivize Exploration in the Sharing Economy , 2017, AAAI.
[2] Sébastien Bubeck,et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn..
[3] N. Nirwanto,et al. The Impact of Product Quality and Price on Customer Satisfaction with the Mediator of Customer Value , 2016 .
[4] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[5] Jon M. Kleinberg,et al. Incentivizing exploration , 2014, EC.
[6] Li Han,et al. Incentivizing Exploration with Heterogeneous Value of Money , 2015, WINE.
[7] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[8] Yang Liu,et al. Incentivizing High Quality User Contributions: New Arm Generation in Bandit Learning , 2018, AAAI.
[9] Yi Gai,et al. Learning Multiuser Channel Allocations in Cognitive Radio Networks: A Combinatorial Multi-Armed Bandit Formulation , 2010, 2010 IEEE Symposium on New Frontiers in Dynamic Spectrum (DySPAN).
[10] Shipra Agrawal,et al. Near-Optimal Regret Bounds for Thompson Sampling , 2017, J. ACM.
[11] Nicole Immorlica,et al. Bayesian Exploration with Heterogeneous Agents , 2019, WWW.
[12] Lihong Li,et al. An Empirical Evaluation of Thompson Sampling , 2011, NIPS.
[13] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables , 1963 .
[14] Nicole Immorlica,et al. Incentivizing Exploration with Unbiased Histories , 2018, ArXiv.
[15] Alda Lopes Gançarski,et al. A Contextual-Bandit Algorithm for Mobile Context-Aware Recommender System , 2012, ICONIP.
[16] Vazifehdust Housein,et al. Customer Perceptions of E-Service Quality in Online Shopping , 2012 .
[17] D. Owen. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables , 1965 .
[18] Siwei Wang,et al. Multi-armed Bandits with Compensation , 2018, NeurIPS.
[19] K. Kristensen,et al. The drivers of customer satisfaction and loyalty: Cross-industry findings from Denmark , 2000 .
[20] T. L. Lai,et al. Asymptotically Efficient Adaptive Allocation Rules , 1985, Advances in Applied Mathematics.
[21] John Langford,et al. Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits , 2014, ICML.
[22] Renato Paes Leme,et al. Stochastic bandits robust to adversarial corruptions , 2018, STOC.
[23] Zahra Ehsani,et al. Effect of Quality and Price on Customer Satisfaction and Commitment in Iran Auto Industry , 2015 .
[24] Lihong Li,et al. Adversarial Attacks on Stochastic Bandits , 2018, NeurIPS.
[25] Haifeng Xu,et al. The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation , 2019, ICML.
[26] Filip Radlinski,et al. Learning diverse rankings with multi-armed bandits , 2008, ICML '08.
[27] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[28] Gwo-Guang Lee,et al. Customer perceptions of e‐service quality in online shopping , 2005 .
[29] Thorsten Joachims,et al. Interactively optimizing information retrieval systems as a dueling bandits problem , 2009, ICML '09.
[30] Mehryar Mohri,et al. Multi-armed Bandit Algorithms and Empirical Evaluation , 2005, ECML.
[31] Shipra Agrawal,et al. Further Optimal Regret Bounds for Thompson Sampling , 2012, AISTATS.