[1] Ronald A. Howard, et al. Information Value Theory, 1966, IEEE Trans. Syst. Sci. Cybern.
[2] Brian D. Ripley, et al. Stochastic Simulation, 2005.
[3] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[4] Stuart J. Russell, et al. Do the right thing - studies in limited rationality, 1991.
[5] Stuart J. Russell, et al. Do the right thing, 1991.
[6] Heekuck Oh, et al. Neural Networks for Pattern Recognition, 1993, Adv. Comput.
[7] Simon Parsons, et al. Do the right thing - studies in limited rationality by Stuart Russell and Eric Wefald, MIT Press, Cambridge, MA, £24.75, ISBN 0-262-18144-4, 1994, The Knowledge Engineering Review.
[8] Stuart J. Russell, et al. Stochastic simulation algorithms for dynamic probabilistic networks, 1995, UAI.
[9] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[10] David Andre, et al. Generalized Prioritized Sweeping, 1997, NIPS.
[11] Yoram Singer, et al. Efficient Bayesian Parameter Estimation in Large Discrete Domains, 1998, NIPS.
[12] Stuart J. Russell, et al. Bayesian Q-Learning, 1998, AAAI/IAAI.
[13] Daphne Koller, et al. Using Learning for Approximation in Stochastic Processes, 1998, ICML.