[1] R. Durrett. Probability: Theory and Examples, 1993.
[2] D. Pollard. Empirical Processes: Theory and Applications, 1990.
[3] David Haussler, et al. Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications, 1992, Inf. Comput.
[4] Paul W. Goldberg, et al. Bounding the Vapnik-Chervonenkis Dimension of Concept Classes Parameterized by Real Numbers, 1993, COLT '93.
[5] Shigenobu Kobayashi, et al. Reinforcement Learning by Stochastic Hill Climbing on Discounted Reward, 1995, ICML.
[6] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[7] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[8] Andrew G. Barto, et al. Reinforcement learning, 1998.
[9] Benjamin Van Roy. Learning and value function approximation in complex decision processes, 1998.
[10] Preben Alstrøm, et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping, 1998, ICML.
[11] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[12] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[13] Yishay Mansour, et al. Approximate Planning in Large POMDPs via Reusable Trajectories, 1999, NIPS.
[14] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.