The linear programming approach to approximate dynamic programming: theory and application
[1] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[2] John N. Tsitsiklis, et al. Feature-based methods for large scale dynamic programming, 1996, Machine Learning.
[3] Shobha Venkataraman, et al. Efficient Solution Algorithms for Factored MDPs, 2003, J. Artif. Intell. Res..
[4] Benjamin Van Roy. Neuro-Dynamic Programming: Overview and Recent Trends, 2002.
[5] J. Tsitsiklis, et al. Performance of Multiclass Markovian Queueing Networks Via Piecewise Linear Lyapunov Functions, 2001.
[6] Mark S. Squillante, et al. On maximizing service-level-agreement profits, 2001, PERV.
[7] John N. Tsitsiklis, et al. Regression methods for pricing complex American-style options, 2001, IEEE Trans. Neural Networks.
[8] Francis A. Longstaff, et al. Valuing American Options by Simulation: A Simple Least-Squares Approach, 2001.
[9] Sean P. Meyn. Sequencing and Routing in Multiclass Queueing Networks Part I: Feedback Regulation, 2001, SIAM J. Control. Optim..
[10] John N. Tsitsiklis, et al. Simulation-based optimization of Markov reward processes, 2001, IEEE Trans. Autom. Control..
[11] Dale Schuurmans, et al. Direct value-approximation for factored MDPs, 2001, NIPS.
[12] J. Baxter, et al. Direct gradient-based reinforcement learning, 2000, Proc. IEEE International Symposium on Circuits and Systems (ISCAS).
[13] John N. Tsitsiklis, et al. Call admission control and routing in integrated services networks using neuro-dynamic programming, 2000, IEEE Journal on Selected Areas in Communications.
[14] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[15] Vivek S. Borkar, et al. Actor-Critic-Type Learning Algorithms for Markov Decision Processes, 1999, SIAM J. Control. Optim..
[16] J. R. Morrison, et al. New Linear Program Performance Bounds for Queueing Networks, 1999.
[17] R. Dudley, et al. Uniform Central Limit Theorems, 1999.
[18] Christine A. Shoemaker, et al. Applying Experimental Design and Regression Splines to High-Dimensional Continuous-State Stochastic Dynamic Programming, 1999, Oper. Res..
[19] Vijay R. Konda, et al. Actor-Critic Algorithms, 1999, NIPS.
[20] Geoffrey J. Gordon, et al. Approximate solutions to Markov decision processes, 1999.
[21] Sean P. Meyn, et al. Value iteration and optimization of multiclass queueing networks, 1998, Proc. 37th IEEE Conference on Decision and Control.
[22] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[23] Benjamin Van Roy. Learning and value function approximation in complex decision processes, 1998.
[24] Mathukumalli Vidyasagar. A Theory of Learning and Generalization, 1997.
[25] John N. Tsitsiklis, et al. Analysis of Temporal-Difference Learning with Function Approximation, 1996, NIPS.
[26] Thomas G. Dietterich, et al. High-Performance Job-Shop Scheduling With A Time-Delay TD(λ) Network, 1995, NIPS.
[27] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[28] Christopher M. Bishop, et al. Neural networks for pattern recognition, 1995.
[29] R. Durrett. Probability: Theory and Examples, 1993.
[30] V. Borkar. A convex analytic approach to Markov decision processes, 1988.
[31] P. Schweitzer, et al. Generalized polynomial approximations in Markovian decision processes, 1985.
[32] A. Hordijk, et al. Linear Programming and Markov Decision Chains, 1979.
[33] E. Denardo. On Linear Programming in a Markov Decision Problem, 1970.
[34] A. F. Veinott. Discrete Dynamic Programming with Sensitive Discount Optimality Criteria, 1969.
[35] D. Luenberger. Optimization by Vector Space Methods, 1968.
[36] F. d'Epenoux, et al. A Probabilistic Production and Inventory Problem, 1963.