TD(λ) networks: temporal-difference networks with eligibility traces
[1] Luc De Raedt, et al. Proceedings of the 22nd international conference on Machine learning, 2005.
[2] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[3] Michael R. James, et al. Predictive State Representations: A New Theory for Modeling Dynamical Systems, 2004, UAI.
[4] Robert E. Schapire, et al. A new approach to unsupervised learning in deterministic environments, 1990.
[5] K. Aberer, et al. German National Research Center for Information Technology, 2007.
[6] Michael R. James, et al. Learning predictive state representations in dynamical systems without reset, 2005, ICML.
[7] Steven J. Bradtke, et al. Linear Least-Squares algorithms for temporal difference learning, 2004, Machine Learning.
[8] Richard S. Sutton, et al. Temporal-Difference Networks, 2004, NIPS.
[9] Satinder P. Singh, et al. A Nonlinear Predictive State Representation, 2003, NIPS.
[10] Justin A. Boyan, et al. Technical Update: Least-Squares Temporal Difference Learning, 2002, Machine Learning.
[11] Richard S. Sutton, et al. TD Models: Modeling the World at a Mixture of Time Scales, 1995, ICML.
[12] Leslie Pack Kaelbling, et al. Hierarchical Learning in Stochastic Domains: Preliminary Results, 1993, ICML.
[13] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[14] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[15] Richard S. Sutton, et al. Temporal-Difference Networks with History, 2005, IJCAI.
[16] H. Jaeger. Discrete-time, discrete-valued observable operator models: a tutorial, 2003.
[17] Peter Stone, et al. Learning Predictive State Representations, 2003, ICML.
[18] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[19] Peter Dayan, et al. Improving Generalization for Temporal Difference Learning: The Successor Representation, 1993, Neural Computation.