Speeding up Q(lambda)-Learning
[1] Pawel Cichosz, et al. Truncating Temporal Differences: On the Efficient Implementation of TD(lambda) for Reinforcement Learning, 1994, J. Artif. Intell. Res.
[2] Sebastian Thrun, et al. Efficient Exploration In Reinforcement Learning, 1992.
[3] Chris Watkins, et al. Learning from delayed rewards, 1989.
[4] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[5] James S. Albus, et al. New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC), 1975.
[6] Teuvo Kohonen, et al. Self-organization and associative memory: 3rd edition, 1989.
[7] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.
[8] Teuvo Kohonen, et al. Self-Organization and Associative Memory, 1988.
[9] Gerald Tesauro, et al. Practical Issues in Temporal Difference Learning, 1992, Mach. Learn.
[10] Steven Douglas Whitehead, et al. Reinforcement learning for the adaptive control of perception and action, 1992.
[11] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[12] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.