Optimal Thompson Sampling strategies for support-aware CVaR bandits
[1] M. Tollenaar,et al. Yield potential, yield stability and stress tolerance in maize , 2002 .
[2] D. Tasche,et al. On the coherence of expected shortfall , 2001, cond-mat/0104295.
[3] Senthold Asseng,et al. The DSSAT crop modeling ecosystem , 2019 .
[4] R. L. McCown,et al. Changing systems for supporting farmers' decisions: problems, paradigms, and prospects , 2002 .
[5] L. T. Evans,et al. Yield potential: its definition, measurement, and significance , 1999 .
[6] Philip S. Thomas,et al. Concentration Inequalities for Conditional Value at Risk , 2019, ICML.
[7] Akimichi Takemura,et al. An Asymptotically Optimal Bandit Algorithm for Bounded Support Models , 2010, COLT.
[8] Eyke Hüllermeier,et al. Qualitative Multi-Armed Bandits: A Quantile-Based Approach , 2015, ICML.
[9] Philippe Artzner,et al. Coherent Measures of Risk , 1999 .
[10] Wouter M. Koolen,et al. Optimal Best-Arm Identification Methods for Tail-Risk Measures , 2020, NeurIPS.
[11] R. Rockafellar,et al. Optimization of conditional value-at-risk , 2000 .
[12] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 2022 .
[13] C. Berge. Topological Spaces: including a treatment of multi-valued functions , 2010 .
[14] Rémi Munos,et al. Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis , 2012, ALT.
[15] C. W. Richardson. WGEN: A Model for Generating Daily Weather Variables , 2018 .
[16] Matthew J. Holland,et al. Learning with CVaR-based feedback under potentially heavy tails , 2020, ArXiv.
[17] B. Mandelbrot. The Variation of Certain Speculative Prices , 1963 .
[18] R. Munos,et al. Kullback–Leibler upper confidence bounds for optimal sequential allocation , 2012, 1210.1136.
[19] David B. Brown,et al. Large deviations bounds for estimating conditional value-at-risk , 2007, Oper. Res. Lett.
[20] P. Massart. The Tight Constant in the Dvoretzky-Kiefer-Wolfowitz Inequality , 1990 .
[21] Krishna Jagannathan,et al. Distribution oblivious, risk-aware algorithms for multi-armed bandits with unbounded rewards , 2019, NeurIPS.
[22] Akimichi Takemura,et al. Non-asymptotic analysis of a new bandit algorithm for semi-bounded rewards , 2015, J. Mach. Learn. Res.
[23] Michal Valko,et al. Extreme bandits , 2014, NIPS.
[24] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[25] A. Burnetas,et al. Optimal Adaptive Policies for Sequential Allocation Problems , 1996 .
[26] Emma Brunskill,et al. Distributionally-Aware Exploration for CVaR Bandits , 2019 .
[27] Junya Honda,et al. Bandit Algorithms Based on Thompson Sampling for Bounded Reward Distributions , 2020, ALT.
[28] Qing Zhao,et al. Mean-variance and value at risk in multi-armed bandit problems , 2015, 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[29] Csaba Szepesvari,et al. Bandit Algorithms , 2020 .
[30] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .
[31] Quirino Paris,et al. The Return of von Liebig's “Law of the Minimum” , 1992 .
[32] Qiuyu Zhu,et al. Thompson Sampling Algorithms for Mean-Variance Bandits , 2020, ICML.
[33] Shie Mannor,et al. A General Approach to Multi-Armed Bandits Under Risk Criteria , 2018, COLT.
[34] Odalric-Ambrym Maillard,et al. Robust Risk-Averse Stochastic Multi-armed Bandits , 2013, ALT.
[35] Qing Zhao,et al. Risk-Averse Multi-Armed Bandit Problems Under Mean-Variance Measure , 2016, IEEE Journal of Selected Topics in Signal Processing.
[36] Michèle Sebag,et al. Exploration vs Exploitation vs Safety: Risk-Aware Multi-Armed Bandits , 2013, ACML.
[37] Shipra Agrawal,et al. Further Optimal Regret Bounds for Thompson Sampling , 2012, AISTATS.
[38] Advances in crop modelling for a sustainable agriculture , 2019 .
[39] Krishna Jagannathan,et al. Constrained regret minimization for multi-criterion multi-armed bandits , 2020, ArXiv.
[40] Tor Lattimore,et al. A Scale Free Algorithm for Stochastic Bandits with Bounded Kurtosis , 2017, NIPS.
[41] Krishnendu Chatterjee,et al. Generalized Risk-Aversion in Stochastic Multi-Armed Bandits , 2014, ArXiv.
[42] Rémi Munos,et al. Thompson Sampling for 1-Dimensional Exponential Family Bandits , 2013, NIPS.
[43] Krishna P. Jagannathan,et al. Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions , 2019, ICML.
[44] P. Carberry,et al. Emerging consensus on desirable characteristics of tools to support farmers' management of climate risk in Australia , 2011 .
[45] Byeong Ho Kang,et al. From Data to Decisions: Helping Crop Producers Build Their Actionable Knowledge , 2017 .
[46] Aurélien Garivier,et al. Explore First, Exploit Next: The True Shape of Regret in Bandit Problems , 2016, Math. Oper. Res.