Shrinking the Upper Confidence Bound: A Dynamic Product Selection Problem for Urban Warehouses
Rong Jin | David Simchi-Levi | Sen Yang | Li Wang | Xinshang Wang