Csaba Szepesvári | Yasin Abbasi-Yadkori | Dávid Pál
[1] H. Robbins, et al. Boundary Crossing Probabilities for the Wiener Process and Sample Sums, 1970.
[2] D. Freedman. On Tail Probabilities for Martingales, 1975.
[3] H. Robbins, et al. Strong consistency of least squares estimates in multiple regression, 1979, Proceedings of the National Academy of Sciences of the United States of America.
[4] T. Lai, et al. Least Squares Estimates in Stochastic Regression Models with Applications to Identification and Control of Dynamic Systems, 1982.
[5] V. N. Bogaevski, et al. Matrix Perturbation Theory, 1991.
[6] T. Lai, et al. Self-Normalized Processes: Limit Theory and Statistical Applications, 2001.
[7] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.
[8] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[9] T. Lai, et al. Self-Normalized Processes: Exponential Inequalities, Moment Bounds and Iterated Logarithm Laws, 2004, math/0410102.
[10] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[11] H. Robbins. Some aspects of the sequential design of experiments, 1952.
[12] Thomas P. Hayes, et al. Stochastic Linear Optimization under Bandit Feedback, 2008, COLT.
[13] Aurélien Garivier, et al. On Upper-Confidence Bound Policies for Non-Stationary Bandit Problems, 2008, 0805.3415.
[14] Csaba Szepesvári, et al. Online Optimization in X-Armed Bandits, 2008, NIPS.
[15] John N. Tsitsiklis, et al. Linearly Parameterized Bandits, 2008, Math. Oper. Res.
[16] Varun Grover, et al. Active learning in heteroscedastic noise, 2010, Theor. Comput. Sci.
[17] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985.