Robust shortest path planning and semicontractive dynamic programming

In this article, we consider shortest path problems in a directed graph where the transitions between nodes are subject to uncertainty. We use a minimax formulation, where the objective is to guarantee that a special destination state is reached with a minimum cost path under the worst possible instance of the uncertainty. Problems of this type arise in planning and pursuit-evasion contexts and in model predictive control, among others. Our analysis makes use of the recently developed theory of abstract semicontractive dynamic programming models. We investigate the existence and uniqueness of solutions of the optimality equation, the existence of optimal paths, and the validity of various algorithms patterned after the classical methods of value and policy iteration, as well as a Dijkstra-like algorithm for problems with nonnegative arc lengths. © 2016 Wiley Periodicals, Inc. Naval Research Logistics 66:15–37, 2019
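
As a rough illustration of the minimax formulation described above, the following Python sketch runs a value-iteration-style sweep on a tiny hand-made graph. It is not taken from the article. The data layout (`graph[x][u]` mapping each possible successor to an arc length), the function name `robust_value_iteration`, and the example graph are all assumptions made for illustration. The sketch assumes nonnegative arc lengths, a nonempty successor set for every control, and an initial cost of infinity at every node other than the destination.

```python
import math


def robust_value_iteration(graph, dest, max_sweeps=1000):
    """Minimax value iteration on a small graph (illustrative sketch only).

    graph[x][u] is a dict {successor y: arc length g(x, u, y)}: at node x
    the planner picks control u, then the uncertainty picks any successor y
    in graph[x][u]. Each successor set is assumed to be nonempty.
    """
    # Collect every node that appears as a source or as a successor.
    nodes = set(graph) | {dest}
    for controls in graph.values():
        for outcomes in controls.values():
            nodes |= set(outcomes)

    # Start from infinity everywhere except the destination (hypothetical choice).
    J = {x: math.inf for x in nodes}
    J[dest] = 0.0

    for _ in range(max_sweeps):
        J_new = {dest: 0.0}
        for x in nodes:
            if x == dest:
                continue
            best = math.inf
            for outcomes in graph.get(x, {}).values():
                # Worst case over the successors the uncertainty may choose.
                worst = max(g + J[y] for y, g in outcomes.items())
                best = min(best, worst)
            J_new[x] = best
        if J_new == J:  # no change in a full sweep: stop
            break
        J = J_new
    return J


# Hypothetical example: from "a" the "risky" control may end at "t" or at "b".
graph = {
    "a": {"safe": {"t": 5.0}, "risky": {"t": 1.0, "b": 1.0}},
    "b": {"go": {"t": 10.0}},
}
J = robust_value_iteration(graph, "t")
print(J["a"], J["b"])
```

In this example the "risky" control at node "a" looks cheaper in the best case, but its worst case routes through "b" at total cost 11, so the minimax cost from "a" is the guaranteed 5 of the "safe" control.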
