Residual Algorithms: Reinforcement Learning with Function Approximation
[1] Terry Jones, et al. Crossover, Macromutation, and Population-Based Search, 1995, ICGA.
[2] Paul J. Werbos, et al. Consistency of HDP applied to a simple reinforcement learning problem, 1990, Neural Networks.
[3] Dimitri P. Bertsekas, et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.
[4] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.
[5] Chris Watkins, et al. Learning from delayed rewards, 1989.
[6] Patrik D'haeseleer, et al. Context preserving crossover in genetic programming, 1994, Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence.
[7] Gerald Tesauro, et al. Neurogammon: a neural-network backgammon program, 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[8] R. A. Fisher, et al. The Genetical Theory of Natural Selection, 1931.
[9] Steven J. Bradtke, et al. Reinforcement Learning Applied to Linear Quadratic Regulation, 1992, NIPS.
[10] G. Tesauro. Practical Issues in Temporal Difference Learning, 1992.
[11] John R. Koza, et al. A genetic approach to the truck backer upper problem and the inter-twined spiral problem, 1992, Proceedings 1992 IJCNN International Joint Conference on Neural Networks.
[12] Ronald J. Williams, et al. Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions, 1993.
[13] E. Odum. Fundamentals of ecology, 1972.