Separable Dynamic Programming and Approximate Decomposition Methods

We consider control, planning, and resource allocation problems involving several independent subsystems that are coupled through a control/decision constraint. We discuss one-step lookahead methods that use an approximate cost-to-go function derived from the solution of single-subsystem problems. We propose a new method for constructing such approximations and derive bounds on the performance of the associated suboptimal policies. We then specialize this method to problems of reachability of target tubes that have the form of a box (a Cartesian product of subsystem tubes). We thus obtain inner approximating tubes, which are unions of a finite number of boxes, each involving single-subsystem calculations.
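
As a rough illustration of the one-step lookahead idea described above, the following sketch writes out a generic scheme in which the cost-to-go approximation is a plain sum of single-subsystem terms. All symbols here (the subsystem index i, dynamics f, stage costs g, disturbances w, coupling set U, and the approximations \tilde J) are assumed notation introduced for this illustration, and the additive form is only the simplest possible choice; the approximation actually constructed in the paper may differ.

```latex
% Assumed notation (not from the source): m subsystems, stage index k.
% State x_k = (x_k^1, ..., x_k^m), control u_k = (u_k^1, ..., u_k^m).
% The subsystems interact only through the joint constraint u_k \in U_k.
\begin{align*}
x_{k+1}^i &= f_k^i\big(x_k^i, u_k^i, w_k^i\big), \qquad i = 1, \dots, m,
  \qquad (u_k^1, \dots, u_k^m) \in U_k, \\[4pt]
% Separable cost-to-go approximation built from single-subsystem problems
\tilde J_{k+1}(x_{k+1}) &= \sum_{i=1}^{m} \tilde J_{k+1}^i\big(x_{k+1}^i\big), \\[4pt]
% One-step lookahead policy using the approximation in place of the optimal cost-to-go
\bar\mu_k(x_k) &\in \arg\min_{u_k \in U_k}
  E\left\{ \sum_{i=1}^{m} \Big[ g_k^i\big(x_k^i, u_k^i, w_k^i\big)
  + \tilde J_{k+1}^i\big(f_k^i(x_k^i, u_k^i, w_k^i)\big) \Big] \right\}.
\end{align*}
```

In this sketch each \tilde J_{k+1}^i depends on the state of subsystem i alone, so it can be computed by a low-dimensional dynamic program; the coupling enters only through the minimization over U_k at the current stage.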
