暂无分享,去创建一个
[1] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[2] Wolfram Burgard,et al. Deep reinforcement learning with successor features for navigation across similar environments , 2016, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[3] Peter Dayan,et al. Improving Generalization for Temporal Difference Learning: The Successor Representation , 1993, Neural Computation.
[4] B. Nordstrom. FINITE MARKOV CHAINS , 2005 .
[5] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[6] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[7] Alborz Geramifard,et al. iLSTD: Eligibility Traces and Convergence Analysis , 2006, NIPS.
[8] G. G. Stokes. "J." , 1890, The New Yale Book of Quotations.
[9] Tom Schaul,et al. Successor Features for Transfer in Reinforcement Learning , 2016, NIPS.
[10] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[11] Alborz Geramifard,et al. Incremental Least-Squares Temporal Difference Learning , 2006, AAAI.
[12] Tsuyoshi Murata,et al. {m , 1934, ACML.
[13] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[14] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[15] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[16] Samuel Gershman,et al. Deep Successor Reinforcement Learning , 2016, ArXiv.
[17] Andrew W. Moore,et al. Prioritized sweeping: Reinforcement learning with less data and less time , 2004, Machine Learning.
[18] Stefanie Tellex,et al. Advantages and Limitations of using Successor Features for Transfer in Reinforcement Learning , 2017, ArXiv.
[19] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[20] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .