Rollout Algorithms for Constrained Dynamic Programming

The rollout algorithm is a suboptimal control method for deterministic and stochastic problems that can be solved by dynamic programming. In this short note, we derive an extension of the rollout algorithm that applies to constrained deterministic dynamic programming problems and relies on a suboptimal policy, called the base heuristic. Under suitable assumptions, we show that if the base heuristic produces a feasible solution, the rollout algorithm also produces a feasible solution whose cost is no worse than that of the base heuristic.
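To make the idea concrete, here is a minimal sketch of a constrained rollout on a toy problem; it is an illustration under stated assumptions, not the note's own formulation. The problem is hypothetical: at each stage one option is chosen, each option has a (cost, weight) pair, and the total weight must stay within a budget. The names `base_heuristic`, `evaluate`, and `constrained_rollout` are invented for this example. The base heuristic here always picks the lowest-weight option at each remaining stage, so it makes the same choice whenever it reaches a given stage; this kind of consistency is assumed for the feasibility and cost-improvement behavior shown, and the note's precise assumptions are not reproduced here.

```python
from typing import List, Tuple

Option = Tuple[int, int]  # (cost, weight); hypothetical problem data


def base_heuristic(stages: List[List[Option]], k: int) -> List[int]:
    """Complete stages k, k+1, ... by taking the lowest-weight option
    at each stage (ties broken by lower cost)."""
    return [
        min(range(len(s)), key=lambda i: (s[i][1], s[i][0]))
        for s in stages[k:]
    ]


def evaluate(stages: List[List[Option]], choices: List[int]) -> Tuple[int, int]:
    """Return (total cost, total weight) of a complete sequence of choices."""
    cost = sum(stages[k][i][0] for k, i in enumerate(choices))
    weight = sum(stages[k][i][1] for k, i in enumerate(choices))
    return cost, weight


def constrained_rollout(stages: List[List[Option]], budget: int) -> List[int]:
    """At each stage, try every option, complete the trajectory with the
    base heuristic, discard infeasible completions, and keep the option
    whose completed trajectory is cheapest."""
    chosen: List[int] = []
    for k in range(len(stages)):
        best = None  # (completed cost, option index)
        for i in range(len(stages[k])):
            candidate = chosen + [i] + base_heuristic(stages, k + 1)
            cost, weight = evaluate(stages, candidate)
            if weight <= budget and (best is None or cost < best[0]):
                best = (cost, i)
        if best is None:
            raise ValueError("no feasible completion from this partial solution")
        chosen.append(best[1])
    return chosen


# Toy instance (assumed data): three stages, two options each, budget 6.
stages = [[(5, 1), (1, 3)], [(4, 1), (1, 2)], [(6, 1), (2, 3)]]
budget = 6
heuristic_solution = base_heuristic(stages, 0)
rollout_solution = constrained_rollout(stages, budget)
```

On this instance the base heuristic's solution is feasible, and the rollout solution can be compared against it by calling `evaluate` on both; the rollout keeps only candidates whose heuristic completion satisfies the budget, which is what preserves feasibility from stage to stage.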
