On-line Markov Decision Processes
[1] David Haussler, et al. How to use expert advice, 1993, STOC.
[2] Manfred K. Warmuth, et al. Additive versus exponentiated gradient updates for linear prediction, 1995, STOC '95.
[3] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[4] A. Blum, et al. Universal portfolios with and without transaction costs, 1997, COLT '97.
[5] Claudio Gentile, et al. Adaptive and Self-Confident On-Line Learning Algorithms, 2000, J. Comput. Syst. Sci.
[6] Nimrod Megiddo, et al. How to Combine Expert (and Novice) Advice when Actions Impact the Environment?, 2003, NIPS.
[7] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[8] Avrim Blum, et al. Planning in the Presence of Cost Functions Controlled by an Adversary, 2003, ICML.
[9] Nimrod Megiddo, et al. Exploration-Exploitation Tradeoffs for Experts Algorithms in Reactive Environments, 2004, NIPS.
[10] Yishay Mansour, et al. Experts in a Markov Decision Process, 2004, NIPS.
[11] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[12] Laurent El Ghaoui, et al. Robust Solutions to Markov Decision Problems with Uncertain Transition Matrices, 2005.
[13] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.