Information Directed Sampling and Bandits with Heteroscedastic Noise
[1] Benjamin Van Roy, et al. An Information-Theoretic Analysis of Thompson Sampling, 2014, J. Mach. Learn. Res.
[2] Benjamin Van Roy, et al. Learning to Optimize via Information-Directed Sampling, 2014, NIPS.
[3] A. C. Aitken. IV.—On Least Squares and Linear Combination of Observations, 1936.
[4] Ambuj Tewari, et al. On the Generalization Ability of Online Strongly Convex Programming Algorithms, 2008, NIPS.
[5] Csaba Szepesvári, et al. Online Learning for Linearly Parametrized Control Problems, 2012.
[6] Thomas M. Cover, et al. Elements of Information Theory, 2005.
[7] Koby Crammer, et al. Linear Multi-Resource Allocation with Semi-Bandit Feedback, 2015, NIPS.
[8] Yu. V. Prokhorov. Convergence of Random Processes and Limit Theorems in Probability Theory, 1956.
[9] W. R. Thompson. On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[10] Csaba Szepesvári, et al. Improved Algorithms for Linear Stochastic Bandits, 2011, NIPS.
[11] Thomas P. Hayes, et al. Stochastic Linear Optimization under Bandit Feedback, 2008, COLT.
[12] Andreas Krause, et al. Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting, 2009, IEEE Transactions on Information Theory.
[13] Rajendra Bhatia, et al. A Better Bound on the Variance, 2000, Am. Math. Mon.
[14] A. Burnetas, et al. Optimal Adaptive Policies for Sequential Allocation Problems, 1996.
[15] Aditya Gopalan, et al. On Kernelized Multi-armed Bandits, 2017, ICML.
[16] Michael N. Katehakis, et al. Normal Bandits of Unknown Means and Variances: Asymptotic Optimality, Finite Horizon Regret Bounds, and a Solution to an Open Problem, 2015, ArXiv.
[17] Annie Marsden, et al. Sequential Matrix Completion, 2017, ArXiv.
[18] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[19] Varun Grover, et al. Active Learning in Heteroscedastic Noise, 2010, Theor. Comput. Sci.
[20] Tor Lattimore, et al. A Scale Free Algorithm for Stochastic Bandits with Bounded Kurtosis, 2017, NIPS.
[21] Sébastien Bubeck, et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, 2012, Found. Trends Mach. Learn.
[22] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[23] D. Freedman. On Tail Probabilities for Martingales, 1975.
[24] Nando de Freitas, et al. Heteroscedastic Treed Bayesian Optimisation, 2014, ArXiv.
[25] Tor Lattimore, et al. The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits, 2016, AISTATS.
[26] Xiequan Fan, et al. Exponential Inequalities for Martingales with Applications, 2013, arXiv:1311.6273.
[27] Wolfram Burgard, et al. Most Likely Heteroscedastic Gaussian Process Regression, 2007, ICML.
[28] Paul W. Goldberg, et al. Regression with Input-dependent Noise: A Gaussian Process Treatment, 1997, NIPS.
[29] Nagarajan Natarajan, et al. Active Heteroscedastic Regression, 2017, ICML.
[30] Zi Wang, et al. Max-value Entropy Search for Efficient Bayesian Optimization, 2017, ICML.
[31] Shipra Agrawal, et al. Thompson Sampling for Contextual Bandits with Linear Payoffs, 2012, ICML.
[32] Alessandro Lazaric, et al. Linear Thompson Sampling Revisited, 2016, AISTATS.