Multi-Armed Bandit for Pricing
[1] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.
[2] H. Chernoff. A Note on an Inequality Involving the Normal Distribution, 1981.
[3] G. Ryzin, et al. Optimal dynamic pricing of inventories with stochastic demand over finite horizons, 1994.
[4] Benoît Leloup, et al. Dynamic Pricing on the Internet: Theory and Simulations, 2001, Electron. Commer. Res.
[5] Christian Schindelhauer, et al. Discrete Prediction Games with Arbitrary Feedback and Loss, 2001, COLT/EuroCOLT.
[6] Frank Thomson Leighton, et al. The value of knowing a demand curve: bounds on regret for online posted-price auctions, 2003, 44th Annual IEEE Symposium on Foundations of Computer Science.
[7] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[8] Nicolò Cesa-Bianchi, et al. Regret Minimization Under Partial Monitoring, 2006, ITW.
[9] D. Bertsimas, et al. Working Paper, 2022.
[10] Eric Cope. Bayesian strategies for dynamic pricing in e-commerce, 2007.
[11] Csaba Szepesvári, et al. Minimax Regret of Finite Partial-Monitoring Games in Stochastic Environments, 2011, COLT.
[12] Qing Zhao, et al. Dynamic Pricing under Finite Space Demand Uncertainty: A Multi-Armed Bandit with Dependent Arms, 2012, ArXiv.
[13] Dean P. Foster, et al. No Internal Regret via Neighborhood Watch, 2011, AISTATS.
[14] Csaba Szepesvári, et al. An adaptive algorithm for finite stochastic partial monitoring, 2012, ICML.
[15] A. V. den Boer, et al. Dynamic Pricing and Learning: Historical Origins, Current Research, and New Directions, 2013.
[16] Csaba Szepesvári, et al. Partial Monitoring - Classification, Regret Bounds, and Algorithms, 2014, Math. Oper. Res.