Online learning in Markov decision processes with arbitrarily changing rewards and transitions
[1] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..
[2] Shie Mannor,et al. The Robustness-Performance Tradeoff in Markov Decision Processes , 2006, NIPS.
[3] Yishay Mansour,et al. Experts in a Markov Decision Process , 2004, NIPS.
[4] Shie Mannor,et al. The Empirical Bayes Envelope and Regret Minimization in Competitive Markov Decision Processes , 2003, Math. Oper. Res..
[5] Santosh S. Vempala,et al. Efficient algorithms for online decision problems , 2005, J. Comput. Syst. Sci..
[6] Manfred K. Warmuth,et al. The Weighted Majority Algorithm , 1994, Inf. Comput..
[7] Y. Mansour,et al. On-line Markov Decision Processes , 2006 .
[8] Gábor Lugosi,et al. Prediction, learning, and games , 2006 .
[9] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[10] J. Filar,et al. Competitive Markov Decision Processes , 1996 .
[11] Yishay Mansour,et al. Online Markov Decision Processes , 2009, Math. Oper. Res..
[12] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[13] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[14] Andrew W. Moore,et al. Locally Weighted Learning , 1997, Artificial Intelligence Review.
[15] James Hannan,et al. Approximation to Bayes Risk in Repeated Play , 1958 .
[16] Eitan Altman,et al. Applications of Dynamic Games in Queues , 2005 .
[17] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[18] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.
[19] Shie Mannor,et al. Markov Decision Processes with Arbitrary Reward Processes , 2008, Math. Oper. Res..
[20] Prakash Narayan,et al. Reliable Communication Under Channel Uncertainty , 1998, IEEE Trans. Inf. Theory.
[21] Laurent El Ghaoui,et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices , 2005, Oper. Res..
[22] P. Schweitzer. Perturbation theory and finite Markov chains , 1968 .
[23] Philip Wolfe,et al. Contributions to the theory of games , 1953 .