Modular Value Iteration through Regional Decomposition
[1] Kevin D. Seppi, et al. Prioritization Methods for Accelerating MDP Solvers, 2005, J. Mach. Learn. Res.
[2] Jürgen Schmidhuber, et al. Sequential Constant Size Compressors for Reinforcement Learning, 2011, AGI.
[3] Angelo Cangelosi, et al. An open-source simulator for cognitive robotics research: the prototype of the iCub humanoid robot simulator, 2008, PerMIS.
[4] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[5] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[6] F. d'Epenoux, et al. A Probabilistic Production and Inventory Problem, 1963.
[7] Leslie Pack Kaelbling, et al. On the Complexity of Solving Markov Decision Problems, 1995, UAI.
[8] Andrew W. Moore, et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time, 1993, Machine Learning.
[9] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[10] Christian Biemann, et al. Chinese Whispers - an Efficient Graph Clustering Algorithm and its Application to Natural Language Processing Problems, 2006.
[11] Peng Dai, et al. Prioritizing Bellman Backups without a Priority Queue, 2007, ICAPS.
[12] Mark B. Ring. Continual learning in reinforcement environments, 1995, GMD-Bericht.
[13] O. Sporns, et al. Complex brain networks: graph theoretical analysis of structural and functional systems, 2009, Nature Reviews Neuroscience.
[14] Leslie Pack Kaelbling, et al. Hierarchical Learning in Stochastic Domains: Preliminary Results, 1993, ICML.
[15] R. Bellman, et al. Dynamic Programming and Markov Processes, 1960.
[16] Richard F. Harris. Chinese whispers, 2003, Current Biology.
[17] Dimitri P. Bertsekas, et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.