Sample Complexity of Risk-Averse Bandit-Arm Selection
[1] J. Bather,et al. Multi‐Armed Bandit Allocation Indices , 1990 .
[2] Richard S. Varga,et al. Proof of Theorem 5 , 1983 .
[3] R. Varga,et al. Proof of Theorem 4 , 1983 .
[4] Ambuj Tewari,et al. PAC Subset Selection in Stochastic Multi-armed Bandits , 2012, ICML.
[5] Alexander Shapiro,et al. On a Class of Minimax Stochastic Programs , 2004, SIAM J. Optim..
[6] Ward Whitt,et al. Numerical inversion of probability generating functions , 1992, Oper. Res. Lett..
[7] Shie Mannor,et al. PAC Bounds for Multi-armed Bandit and Markov Decision Processes , 2002, COLT.
[8] John N. Tsitsiklis,et al. Mean-Variance Optimization in Markov Decision Processes , 2011, ICML.
[9] Csaba Szepesvári,et al. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits , 2009, Theor. Comput. Sci..
[10] J. Neumann,et al. Theory of games and economic behavior , 1945, 100 Years of Math Milestones.
[11] Philippe Artzner,et al. Coherent Measures of Risk , 1999 .
[12] Uriel G. Rothblum,et al. Risk-Sensitive and Risk-Neutral Multiarmed Bandits , 2007, Math. Oper. Res..
[13] A. Palma,et al. Risk aversion in expected intertemporal discounted utilities bandit problems , 2009 .
[14] A. Schied. Risk Measures and Robust Optimization Problems , 2006 .
[15] Alessandro Lazaric,et al. Risk-Aversion in Multi-armed Bandits , 2012, NIPS.
[16] Chaitanya Swamy,et al. An approximation scheme for stochastic linear programming and its application to stochastic integer programs , 2006, JACM.
[17] R. Rockafellar,et al. Optimization of conditional value-at-risk , 2000 .
[18] David B. Brown,et al. Large deviations bounds for estimating conditional value-at-risk , 2007, Oper. Res. Lett..
[19] Thomas D. Sandry,et al. Probabilistic and Randomized Methods for Design Under Uncertainty , 2007, Technometrics.
[20] Takayuki Osogami,et al. Iterated risk measures for risk-sensitive Markov decision processes with discounted cost , 2011, UAI.
[21] Dan Rosen,et al. Measuring Portfolio Risk Using Quasi Monte Carlo Methods , 1998 .
[22] Rémi Munos,et al. Pure exploration in finitely-armed and continuous-armed bandits , 2011, Theor. Comput. Sci..
[23] R. Tyrrell Rockafellar,et al. Coherent Approaches to Risk in Optimization Under Uncertainty , 2007 .
[24] Nicolò Cesa-Bianchi,et al. Potential-Based Algorithms in On-Line Prediction and Game Theory , 2003, Machine Learning.
[25] G. Lugosi,et al. Consistency of Data-driven Histogram Methods for Density Estimation and Classification , 1996 .
[26] Jan Dhaene,et al. Comparing Approximations for Risk Measures of Sums of Nonindependent Lognormal Random Variables , 2005 .
[27] R. Munos,et al. Best Arm Identification in Multi-Armed Bandits , 2010, COLT.
[28] Herbert A. David,et al. Order Statistics , 2011, International Encyclopedia of Statistical Science.
[29] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[30] John N. Tsitsiklis,et al. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem , 2004, J. Mach. Learn. Res..
[31] J. Tsitsiklis,et al. Robust, risk-sensitive, and data-driven control of Markov decision processes , 2007 .
[32] Mary R. Hardy,et al. Estimating the Variance of Bootstrapped Risk Measures , 2009, ASTIN Bulletin.
[33] Bruce L. Jones,et al. Empirical Estimation of Risk Measures and Related Quantities , 2003 .
[34] Alessandro Lazaric,et al. Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence , 2012, NIPS.
[35] G. Calafiore,et al. Probabilistic and Randomized Methods for Design under Uncertainty , 2006 .