[1] R. Tweedie,et al. Strengthening ergodicity to geometric ergodicity for Markov chains , 1994 .
[2] J.N. Tsitsiklis,et al. A structured multiarmed bandit problem and the greedy policy , 2008, 2008 47th IEEE Conference on Decision and Control.
[3] Sanjay Shakkottai,et al. Regret of Queueing Bandits , 2016, NIPS.
[4] J. Walrand,et al. The cμ rule revisited , 1985, Advances in Applied Probability.
[5] Kevin D. Glazebrook,et al. Whittle's index policy for a multi-class queueing system with convex holding costs , 2003, Math. Methods Oper. Res..
[6] Sampath Kannan,et al. A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem , 2018, NeurIPS.
[7] Demosthenis Teneketzis,et al. Multi-Armed Bandit Problems , 2008 .
[8] M. Kijima,et al. FURTHER RESULTS FOR DYNAMIC SCHEDULING OF MULTICLASS G/G/1 QUEUES , 1989 .
[9] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[10] Khashayar Khosravi,et al. Exploiting the Natural Exploration In Contextual Bandits , 2017, ArXiv.
[11] J.M. Schopf,et al. Stochastic Scheduling , 1999, ACM/IEEE SC 1999 Conference (SC'99).
[12] Ronald J. Williams,et al. Dynamic scheduling of a system with two parallel servers in heavy traffic with resource pooling: asymptotic optimality of a threshold policy , 2001 .
[13] J. V. Mieghem. Dynamic Scheduling with Convex Delay Costs: The Generalized cμ Rule , 1995 .
[14] Alexander L. Stolyar,et al. Scheduling Flexible Servers with Convex Delay Costs: Heavy-Traffic Optimality of the Generalized cμ-Rule , 2004, Oper. Res..
[15] J. Harrison. Heavy traffic analysis of a system with parallel servers: asymptotic optimality of discrete-review policies , 1998 .
[16] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[17] Demosthenis Teneketzis,et al. ON THE OPTIMALITY OF AN INDEX RULE IN MULTICHANNEL ALLOCATION FOR SINGLE-HOP MOBILE NETWORKS WITH MULTIPLE SERVICE CLASSES , 2000 .
[18] G. Klimov. Time-Sharing Service Systems. I , 1975 .
[19] J. Dedecker,et al. Subgaussian concentration inequalities for geometrically ergodic Markov chains , 2014, 1412.1794.
[20] Richard L. Tweedie,et al. Markov Chains and Stochastic Stability , 1993, Communications and Control Engineering Series.
[21] B. Hajek. Hitting-time and occupation-time bounds implied by drift analysis with applications , 1982, Advances in Applied Probability.
[22] Shipra Agrawal,et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem , 2011, COLT.
[23] José Niño-Mora. Stochastic Scheduling , 2009, Encyclopedia of Optimization.
[24] Kevin D. Glazebrook,et al. Parallel Scheduling of Multiclass M/M/m Queues: Approximate and Heavy-Traffic Optimization of Achievable Performance , 2001, Oper. Res..