Bayesian Learning of Recursively Factored Environments
[1] Richard S. Sutton, et al. Dyna, an integrated architecture for learning, planning, and reacting, 1990, SIGART Bull.
[2] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[3] F. Willems, et al. Complexity reduction of the context-tree weighting algorithm: a study for KPN Research, 1997.
[4] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[5] Robert J. McEliece, et al. The generalized distributive law, 2000, IEEE Trans. Inf. Theory.
[6] Y. Shtarkov, et al. The context-tree weighting method: basic properties, 1995, IEEE Trans. Inf. Theory.
[7] Marcus Hutter, et al. Universal artificial intelligence, 2004.
[8] Marcus Hutter, et al. Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability (Texts in Theoretical Computer Science. An EATCS Series), 2006.
[9] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[10] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[11] P. Grünwald. The Minimum Description Length Principle (Adaptive Computation and Machine Learning), 2007.
[12] Geoffrey E. Hinton, et al. The Recurrent Temporal Restricted Boltzmann Machine, 2008, NIPS.
[13] Pascal Poupart, et al. Model-based Bayesian Reinforcement Learning in Partially Observable Domains, 2008, ISAIM.
[14] Richard S. Sutton, et al. Sample-based learning and search with permanent and transient memories, 2008, ICML.
[15] Joelle Pineau, et al. Model-Based Bayesian Reinforcement Learning in Large Structured Domains, 2008, UAI.
[16] Finale Doshi-Velez, et al. The Infinite Partially Observable Markov Decision Process, 2009, NIPS.
[17] Lihong Li, et al. The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning, 2009, ICML.
[18] Csaba Szepesvári, et al. Model-based and Model-free Reinforcement Learning for Visual Servoing, 2009, IEEE International Conference on Robotics and Automation.
[19] Joel Veness, et al. Reinforcement Learning via AIXI Approximation, 2010, AAAI.
[20] Yavar Naddaf, et al. Game-independent AI agents for playing Atari 2600 console games, 2010.
[21] Thomas J. Walsh, et al. Integrating Sample-Based Planning and Model-Based Reinforcement Learning, 2010, AAAI.
[22] Joel Veness, et al. Monte-Carlo Planning in Large POMDPs, 2010, NIPS.
[23] Joel Veness, et al. A Monte-Carlo AIXI Approximation, 2009, J. Artif. Intell. Res.
[24] Michael L. Littman, et al. Learning is planning: near Bayes-optimal reinforcement learning via Monte-Carlo tree search, 2011, UAI.
[25] Risto Miikkulainen, et al. HyperNEAT-GGP: a HyperNEAT-based Atari general game player, 2012, GECCO.
[26] Marc G. Bellemare, et al. Investigating Contingency Awareness Using Atari 2600 Games, 2012, AAAI.
[27] Peter Dayan, et al. Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search, 2012, NIPS.
[28] Richard S. Sutton, et al. Temporal-difference search in computer Go, 2012, Machine Learning.
[29] Joel Veness, et al. Sparse Sequential Dirichlet Coding, 2012, ArXiv.
[30] Marcus Hutter, et al. Context tree maximizing reinforcement learning, 2012, AAAI.
[31] Marc G. Bellemare, et al. Sketch-Based Linear Value Function Approximation, 2012, NIPS.
[32] Joel Veness, et al. Context Tree Switching, 2012, Data Compression Conference.
[33] Alborz Geramifard, et al. Reinforcement learning with misspecified model classes, 2013, IEEE International Conference on Robotics and Automation.
[34] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.