Non-Linear Stochastic Control in Continuous State Spaces by Exact Integration in Bellman's Equations
[1] G. Tesauro. Practical Issues in Temporal Difference Learning, 1992.
[2] Geoffrey J. Gordon. Stable Fitted Reinforcement Learning, 1995, NIPS.
[3] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[4] Andrew G. Barto, et al. Elevator Group Control Using Multiple Reinforcement Learning Agents, 1998, Machine Learning.
[5] Thomas G. Dietterich, et al. High-Performance Job-Shop Scheduling With A Time-Delay TD(λ) Network, 1995, NIPS 1995.
[6] Andrew W. Moore, et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function, 1994, NIPS.
[7] John N. Tsitsiklis, et al. Feature-based methods for large scale dynamic programming, 2004, Machine Learning.
[8] Sebastian Thrun, et al. Probabilistic Algorithms in Robotics, 2000, AI Mag.
[9] T. Poggio, et al. Networks and the best approximation property, 1990, Biological Cybernetics.
[10] Mark W. Spong, et al. The swing up control problem for the Acrobot, 1995.
[11] Michael I. Jordan, et al. Advances in Neural Information Processing Systems 30, 1995.
[12] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[13] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[14] Robert F. Stengel, et al. Optimal Control and Estimation, 1994.
[15] Wolfram Burgard, et al. A Probabilistic Approach to Concurrent Mapping and Localization for Mobile Robots, 1998, Auton. Robots.
[16] Thomas G. Dietterich, et al. High-Performance Job-Shop Scheduling With A Time-Delay TD-lambda Network, 1995, NIPS.
[17] Dirk Ormoneit, et al. Kernel-Based Reinforcement Learning, 2017, Encyclopedia of Machine Learning and Data Mining.