Reinforcement learning

[1]  P. Schrimpf, et al.  Dynamic Programming, 2011.

[2]  Richard S. Sutton, et al.  Learning to predict by the methods of temporal differences, 1988, Machine Learning.

[3]  Richard S. Sutton, et al.  Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[4]  Ben J. A. Kröse, et al.  Learning from delayed rewards, 1995, Robotics Auton. Syst.

[5]  A. Barto, et al.  Learning and Sequential Decision Making, 1989.

[6]  G. Grisetti, et al.  Further Reading, 1984, IEEE Spectrum.