Neuro-Dynamic Programming: An Overview and Recent Results
Neuro-dynamic programming is a methodology for sequential decision making under uncertainty that is based on dynamic programming. The key idea is to use a scoring function to select decisions in complex dynamic systems, which arise in a broad variety of applications including engineering design, operations research, resource allocation, and finance. This is much like what is done in computer chess, where positions are evaluated by means of a scoring function and the move that leads to the position with the best score is chosen. Neuro-dynamic programming provides a class of systematic methods for computing appropriate scoring functions using approximation schemes and simulation/evaluation of the system's performance.
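The decision rule described above, choosing the option whose resulting position scores best, can be sketched as a one-step lookahead. This is only an illustration under stated assumptions: the toy problem (a noisy walk on the integers 0 to 10), the names `greedy_decision`, `toy_transitions`, and `toy_score`, and the convention that a lower score is better (the score stands in for an approximate cost-to-go) are all hypothetical and not taken from the paper.

```python
from typing import Callable, List, Tuple

State = int
Action = int
# transitions(state, action) -> list of (probability, next_state, stage_cost)
Transitions = Callable[[State, Action], List[Tuple[float, State, float]]]


def greedy_decision(
    state: State,
    actions: List[Action],
    transitions: Transitions,
    score: Callable[[State], float],
) -> Action:
    """Return the action with the lowest expected (stage cost + score of next state)."""

    def lookahead_value(action: Action) -> float:
        return sum(
            prob * (cost + score(next_state))
            for prob, next_state, cost in transitions(state, action)
        )

    return min(actions, key=lookahead_value)


# Hypothetical toy problem: reach GOAL on the line 0..GOAL, paying 1 per step.
GOAL = 10


def toy_transitions(state: State, action: Action) -> List[Tuple[float, State, float]]:
    intended = min(max(state + action, 0), GOAL)
    # Assumed noise model: the move succeeds with probability 0.8,
    # otherwise the system stays where it is.
    return [(0.8, intended, 1.0), (0.2, state, 1.0)]


def toy_score(state: State) -> float:
    # Assumed scoring function: distance to the goal as a crude cost-to-go estimate.
    return float(GOAL - state)


chosen = greedy_decision(3, [-1, +1], toy_transitions, toy_score)
print(chosen)
```

In this sketch the scoring function is written by hand. The methods the abstract refers to would instead compute it through approximation and simulation; that part is not shown here.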