The linear programming approach to approximate dynamic programming: theory and application
