暂无分享,去创建一个
[1] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[2] Sridhar Mahadevan,et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes , 2007, J. Mach. Learn. Res..
[3] Richard S. Sutton,et al. Sample-based learning and search with permanent and transient memories , 2008, ICML '08.
[4] Doina Precup,et al. Sparse Distributed Memories for On-Line Value-Based Reinforcement Learning , 2004, ECML.
[5] Lihong Li,et al. Analyzing feature generation for value-function approximation , 2007, ICML '07.
[6] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[7] Peter Stone,et al. Reinforcement Learning for RoboCup Soccer Keepaway , 2005, Adapt. Behav..
[8] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[9] Alborz Geramifard,et al. Online Discovery of Feature Dependencies , 2011, ICML.
[10] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[11] S. Whiteson,et al. Adaptive Tile Coding for Value Function Approximation , 2007 .
[12] Stephen Lin,et al. Evolutionary Tile Coding: An Automated State Abstraction Algorithm for Reinforcement Learning , 2010, Abstraction, Reformulation, and Approximation.
[13] Nathan R. Sturtevant,et al. Feature Construction for Reinforcement Learning in Hearts , 2006, Computers and Games.
[14] Shie Mannor,et al. Automatic basis function construction for approximate dynamic programming and reinforcement learning , 2006, ICML.
[15] Ronald Parr,et al. Greedy Algorithms for Sparse Reinforcement Learning , 2012, ICML.
[16] Stéphane Mallat,et al. Matching pursuits with time-frequency dictionaries , 1993, IEEE Trans. Signal Process..
[17] Thomas L. Griffiths,et al. Compositionality in rational analysis: grammar-based induction for concept learning , 2008 .
[18] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[19] Carlos Guestrin,et al. Max-norm Projections for Factored MDPs , 2001, IJCAI.