Optimistic Planning of Deterministic Systems
[1] Andrew P. Sage,et al. Uncertainty in Artificial Intelligence , 1987, IEEE Transactions on Systems, Man, and Cybernetics.
[2] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[3] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res.
[4] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[5] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[6] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[7] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[8] Frédérick Garcia,et al. On-Line Search for Solving Markov Decision Processes via Heuristic Sampling , 2004, ECAI.
[9] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[10] Frederick Garcia. On-line search for solving large Markov decision processes , 2004 .
[11] Olivier Teytaud,et al. Modification of UCT with Patterns in Monte-Carlo Go , 2006 .
[12] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[13] Rémi Munos,et al. Bandit Algorithms for Tree Search , 2007, UAI.
[14] H. Robbins. Some aspects of the sequential design of experiments , 1952 .
[15] Leon G. Higley,et al. Forensic Entomology: An Introduction , 2009 .
[16] T. L. Lai Andherbertrobbins. Asymptotically Efficient Adaptive Allocation Rules , 2022 .