Joel Veness | Marcus Hutter | David Silver | Kee Siong Ng
[1] Frans M. J. Willems, et al. The context-tree weighting method: basic properties, 1995, IEEE Trans. Inf. Theory.
[2] Tao Wang, et al. Bayesian sparse sampling for on-line reward optimization, 2005, ICML.
[3] Joel Veness, et al. A Monte-Carlo AIXI Approximation, 2009, J. Artif. Intell. Res..
[4] Andrew McCallum, et al. Reinforcement learning with selective perception and hidden state, 1996.
[5] Benjamin Van Roy, et al. Universal Reinforcement Learning, 2007, IEEE Transactions on Information Theory.
[6] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[7] Dana Ron, et al. The power of amnesia: Learning probabilistic automata with variable memory length, 1996, Machine Learning.
[8] Akira Hayashi, et al. A Reinforcement Learning Algorithm in Partially Observable Environments Using Short-Term Memory, 1998, NIPS.
[9] Pascal Poupart, et al. Model-based Bayesian Reinforcement Learning in Partially Observable Domains, 2008, ISAIM.
[10] Bret Hoehn, et al. Effective short-term opponent exploitation in simplified poker, 2005, Machine Learning.
[11] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res..
[12] Marcus Hutter, et al. Universal Artificial Intelligence - Sequential Decisions Based on Algorithmic Probability, 2005, Texts in Theoretical Computer Science. An EATCS Series.
[13] Malcolm J. A. Strens, et al. A Bayesian Framework for Reinforcement Learning, 2000, ICML.