Mohammad Ghavamzadeh | Aldo Pacchiano | Heinrich Jiang | Peter Bartlett
[1] Santiago Ontañón, et al. The Combinatorial Multi-Armed Bandit Problem and Its Application to Real-Time Strategy Games, 2013, AIIDE.
[2] Csaba Szepesvári, et al. Improved Algorithms for Linear Stochastic Bandits, 2011, NIPS.
[3] Alessandro Lazaric, et al. Linear Thompson Sampling Revisited, 2016, AISTATS.
[4] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985, Advances in Applied Mathematics.
[5] Shipra Agrawal, et al. Thompson Sampling for Contextual Bandits with Linear Payoffs, 2012, ICML.
[6] Csaba Szepesvári, et al. Bandit Algorithms, 2020.
[7] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[8] Aurélien Garivier, et al. On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models, 2014, J. Mach. Learn. Res.
[9] R. Srikant, et al. Algorithms with Logarithmic or Sublinear Regret for Constrained Contextual Bandits, 2015, NIPS.
[10] Nikhil R. Devanur, et al. Bandits with Concave Rewards and Convex Knapsacks, 2014, EC.
[11] Jack Bowden, et al. Multi-armed Bandit Models for the Optimal Design of Clinical Trials: Benefits and Challenges, 2015, Statistical Science.
[12] Francesca Rossi, et al. Using Contextual Bandits with Behavioral Constraints for Constrained Online Movie Recommendation, 2018, IJCAI.
[13] Thomas P. Hayes, et al. Stochastic Linear Optimization under Bandit Feedback, 2008, COLT.
[14] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[15] Robert B. Washburn, et al. Application of Multi-Armed Bandits to Sensor Management, 2008.
[16] Nikhil R. Devanur, et al. Linear Contextual Bandits with Knapsacks, 2015, NIPS.
[17] Shipra Agrawal, et al. Further Optimal Regret Bounds for Thompson Sampling, 2012, AISTATS.
[18] Wei Chu, et al. A Contextual-Bandit Approach to Personalized News Article Recommendation, 2010, WWW '10.
[19] Alessandro Lazaric, et al. Improved Algorithms for Conservative Exploration in Bandits, 2020, AAAI.
[20] Setareh Maghsudi, et al. Multi-Armed Bandits with Application to 5G Small Cells, 2015, IEEE Wireless Communications.
[21] Christos Thrampoulidis, et al. Linear Stochastic Bandits Under Safety Constraints, 2019, NeurIPS.
[22] Aleksandrs Slivkins, et al. Bandits with Knapsacks, 2013, IEEE 54th Annual Symposium on Foundations of Computer Science (FOCS).
[23] Christos Thrampoulidis, et al. Safe Linear Thompson Sampling With Side Information, 2021, IEEE Transactions on Signal Processing.
[24] Benjamin Van Roy, et al. Conservative Contextual Linear Bandits, 2016, NIPS.
[25] John N. Tsitsiklis, et al. Linearly Parameterized Bandits, 2008, Math. Oper. Res.
[26] John Langford, et al. Resourceful Contextual Bandits, 2014, COLT.
[27] Christos Thrampoulidis, et al. Safe Linear Thompson Sampling, 2019, arXiv.
[28] Yifan Wu, et al. Conservative Bandits, 2016, ICML.