Linear Multi-Resource Allocation with Semi-Bandit Feedback
[1] G. Bennett. Probability Inequalities for the Sum of Independent Random Variables , 1962 .
[2] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables , 1963 .
[3] D. Freedman. On Tail Probabilities for Martingales , 1975 .
[4] Kristin P. Bennett,et al. Bilinear separation of two sets in n-space , 1993, Comput. Optim. Appl..
[5] T. Sowell. Is Reality Optional?: And Other Essays , 1993 .
[6] M. Habib. Probabilistic methods for algorithmic discrete mathematics , 1998 .
[7] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..
[8] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[9] Thomas P. Hayes,et al. Stochastic Linear Optimization under Bandit Feedback , 2008, COLT.
[10] John N. Tsitsiklis,et al. Linearly Parameterized Bandits , 2008, Math. Oper. Res..
[11] Csaba Szepesvári,et al. Improved Algorithms for Linear Stochastic Bandits , 2011, NIPS.
[12] Marek Petrik,et al. Robust Approximate Bilinear Programming for Value Function Approximation , 2011, J. Mach. Learn. Res..
[13] Sébastien Bubeck,et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn..
[14] Csaba Szepesvári,et al. Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits , 2012, AISTATS.
[15] Shipra Agrawal,et al. Thompson Sampling for Contextual Bandits with Linear Payoffs , 2012, ICML.
[16] Koby Crammer,et al. Optimal Resource Allocation with Semi-Bandit Feedback , 2014, UAI.
[17] Wtt Wtt. Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits , 2015 .
[18] Zheng Wen,et al. Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits , 2014, AISTATS.
[19] Akshay Krishnamurthy,et al. Efficient Contextual Semi-Bandit Learning , 2015, ArXiv.