Robust Multiarmed Bandit Problems
[1] J.N. Tsitsiklis,et al. A structured multiarmed bandit problem and the greedy policy , 2008, 2008 47th IEEE Conference on Decision and Control.
[2] Laurent El Ghaoui,et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices , 2005, Oper. Res..
[3] Arkadi Nemirovski,et al. Robust solutions of uncertain linear programs , 1999, Oper. Res. Lett..
[4] P. Whittle. Risk-sensitive linear/quadratic/gaussian control , 1981, Advances in Applied Probability.
[5] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..
[6] Andrew E. B. Lim,et al. Model Uncertainty, Robust Optimization and Learning , 2006 .
[7] H. Robbins. Some aspects of the sequential design of experiments , 1952 .
[8] Steven L. Scott,et al. A modern Bayesian look at the multi-armed bandit , 2010 .
[9] Martin Schneider,et al. Recursive multiple-priors , 2003, J. Econ. Theory.
[10] Laurent El Ghaoui,et al. Robust Solutions to Least-Squares Problems with Uncertain Data , 1997, SIAM J. Matrix Anal. Appl..
[11] J. Lynch,et al. A weak convergence approach to the theory of large deviations , 1997 .
[12] Nicolò Cesa-Bianchi,et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem , 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[13] Larry G. Epstein,et al. Learning Under Ambiguity , 2002 .
[14] H. Robbins,et al. Asymptotically efficient adaptive allocation rules , 1985 .
[15] Peng Sun,et al. Information Relaxations and Duality in Stochastic Dynamic Programs , 2010, Oper. Res..
[16] Melvyn Sim,et al. The Price of Robustness , 2004, Oper. Res..
[17] Warren B. Powell,et al. The Knowledge Gradient Algorithm for a General Class of Online Learning Problems , 2012, Oper. Res..
[18] Andrew E. B. Lim,et al. Robust Portfolio Choice with Learning in the Framework of Regret: Single-Period Case , 2012, Manag. Sci..
[19] Ian R. Petersen,et al. Minimax optimal control of stochastic uncertain systems with relative entropy constraints , 2000, IEEE Trans. Autom. Control..
[20] Garud Iyengar,et al. Robust Dynamic Programming , 2005, Math. Oper. Res..
[21] R. Agrawal. The Continuum-Armed Bandit Problem , 1995 .
[22] José Niño-Mora,et al. Towards minimum loss job routing to parallel heterogeneous multiserver queues via index policies , 2012, Eur. J. Oper. Res..
[23] P. Whittle. A risk-sensitive maximum principle: the case of imperfect state observation , 1991 .
[24] L El Ghaoui,et al. Robust Solutions to Least-Square Problems to Uncertain Data Matrices , 1997 .
[25] L. C. G. Rogers,et al. Pathwise Stochastic Optimal Control , 2007, SIAM J. Control. Optim..
[26] Dimitris Bertsimas,et al. Conservation laws, extended polymatroids and multi-armed bandit problems: a unified approach to indexable systems , 2011, IPCO.
[27] A Ben Tal,et al. Robust Solutions to Uncertain Programs , 1999 .
[28] Roy H. Kwon,et al. Portfolio selection under model uncertainty: a penalized moment-based optimization approach , 2013, J. Glob. Optim..
[29] J. Tsitsiklis. A lemma on the multiarmed bandit problem , 1986 .
[30] Wolfgang J. Runggaldier,et al. Connections between stochastic control and dynamic games , 1996, Math. Control. Signals Syst..
[31] Enlu Zhou,et al. Information Relaxation and Dual Formulation of Controlled Markov Diffusions , 2013, IEEE Transactions on Automatic Control.
[32] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[33] P. Whittle. A risk-sensitive maximum principle , 1990 .
[34] Carri W. Chan,et al. Stochastic Depletion Problems: Effective Myopic Policies for a Class of Dynamic Optimization Problems , 2008, Math. Oper. Res..
[35] Daniel Kuhn,et al. Robust Markov Decision Processes , 2013, Math. Oper. Res..
[36] James E. Smith,et al. Optimal Sequential Exploration: Bandits, Clairvoyants, and Wildcats , 2013, Oper. Res..
[37] Moshe Babaioff,et al. Characterizing truthful multi-armed bandit mechanisms: extended abstract , 2008, EC '09.
[38] Rhodes,et al. Optimal stochastic linear systems with exponential performance criteria and their relation to deterministic differential games , 1973 .
[39] Onésimo Hernández-Lerma,et al. Minimax Control of Discrete-Time Stochastic Systems , 2002, SIAM J. Control. Optim..
[40] Deepayan Chakrabarti,et al. Multi-armed bandit problems with dependent arms , 2007, ICML '07.
[41] B. Efron. Bootstrap Methods: Another Look at the Jackknife , 1979 .
[42] Lars Peter Hansen,et al. Recursive Robust Estimation and Control Without Commitment , 2007, J. Econ. Theory.
[43] J. Gittins. Bandit processes and dynamic allocation indices , 1979 .
[44] Lars Peter Hansen,et al. Robust Estimation and Control without Commitment , 2014 .
[45] Robert D. Kleinberg. Nearly Tight Bounds for the Continuum-Armed Bandit Problem , 2004, NIPS.
[46] David B. Shmoys,et al. Dynamic Assortment Optimization with a Multinomial Logit Choice Model and Capacity Constraint , 2010, Oper. Res..
[47] Andrew E. B. Lim,et al. Relative Entropy, Exponential Utility, and Robust Dynamic Pricing , 2007, Oper. Res..
[48] Andrew E. B. Lim,et al. Linear-quadratic control and information relaxations , 2012, Oper. Res. Lett..
[49] Arkadi Nemirovski,et al. Robust solutions of Linear Programming problems contaminated with uncertain data , 2000, Math. Program..
[50] P. Whittle. Multi‐Armed Bandits and the Gittins Index , 1980 .
[51] Felipe Caro,et al. Robust Control of the Multi-Armed Bandit Problem , 2014 .
[52] Lars Peter Hansen,et al. Robust estimation and control under commitment , 2005, J. Econ. Theory.
[53] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[54] Andrew E. B. Lim,et al. Robust Asset Allocation with Benchmarked Objectives , 2008 .
[55] Arkadi Nemirovski,et al. Robust Convex Optimization , 1998, Math. Oper. Res..
[56] Felipe Caro,et al. Dynamic Assortment with Demand Learning for Seasonal Consumer Goods , 2007, Manag. Sci..