Sliding-Window Thompson Sampling for Non-Stationary Settings
Nicola Gatti | Marcello Restelli | Francesco Trovò | Stefano Paladino
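Below is a minimal sketch of the sliding-window Thompson Sampling idea for Bernoulli bandits: Beta posteriors are built only from the rewards observed in the last `window` rounds, so the learner can track abrupt changes in the arms' expected rewards. The class and parameter names (SlidingWindowThompsonSampling, n_arms, window) are illustrative assumptions, not the paper's notation or pseudocode.

```python
import random
from collections import deque


class SlidingWindowThompsonSampling:
    """Bernoulli bandit agent that samples Beta posteriors built
    only from the last `window` plays (illustrative sketch)."""

    def __init__(self, n_arms: int, window: int):
        self.n_arms = n_arms
        self.history = deque(maxlen=window)  # stores (arm, reward) pairs

    def select_arm(self) -> int:
        # Count successes/failures per arm inside the sliding window.
        succ = [0] * self.n_arms
        fail = [0] * self.n_arms
        for arm, reward in self.history:
            if reward:
                succ[arm] += 1
            else:
                fail[arm] += 1
        # Thompson step: draw one sample per arm from Beta(1 + s, 1 + f)
        # (uniform prior) and play the arm with the largest draw.
        samples = [random.betavariate(1 + succ[a], 1 + fail[a])
                   for a in range(self.n_arms)]
        return max(range(self.n_arms), key=lambda a: samples[a])

    def update(self, arm: int, reward: int) -> None:
        self.history.append((arm, reward))  # old plays drop out automatically


# Toy usage: two arms whose success probabilities swap halfway through.
if __name__ == "__main__":
    agent = SlidingWindowThompsonSampling(n_arms=2, window=200)
    for t in range(2000):
        probs = [0.7, 0.3] if t < 1000 else [0.3, 0.7]
        arm = agent.select_arm()
        reward = 1 if random.random() < probs[arm] else 0
        agent.update(arm, reward)
```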