Dynamic Shortest Paths in Acyclic Networks with Markovian Arc Costs

We examine shortest path problems in acyclic networks in which arc costs are known functions of certain environment variables at network nodes. Each of these variables evolves according to an independent Markov process. The vehicle can wait at a node (at a cost) in anticipation of more favorable arc costs. We first develop two recursive procedures for the individual arc case, one based on successive approximations and the other on policy iteration. We also solve the same problem via parametric linear programming. We show that the optimal policy essentially classifies the states of the environment variable at a node into two categories: green states, for which the optimal action is to traverse the arc immediately, and red states, for which the optimal action is to wait. We then extend these concepts to the entire network by developing a dynamic programming procedure that solves the corresponding problem. The complexity of this method is shown to be O(n²K + nK³), where n is the number of network nodes and K is the number of possible states of each environment variable.
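As a rough illustration of the single-arc case, the sketch below runs successive approximations on a small wait-or-traverse model and then reads off the green/red classification from the resulting value function. The transition matrix, arc-cost vector, waiting cost, and stopping tolerance are illustrative assumptions rather than data or formulas from the paper; the positive waiting cost is what makes waiting forever suboptimal and lets the iteration converge.

import numpy as np

def single_arc_green_red(P, arc_cost, wait_cost, tol=1e-10, max_iter=100_000):
    """Successive approximations for a single-arc wait-or-traverse model.

    P         : K x K transition matrix of the node's environment Markov chain
                (one transition per period spent waiting).
    arc_cost  : length-K vector of arc traversal costs, one per environment state.
    wait_cost : cost of waiting one period at the node (assumed > 0).

    Returns the value function V and the list of 'green' states, i.e. the
    states in which traversing the arc immediately is optimal.
    """
    K = len(arc_cost)
    V = np.zeros(K)
    for _ in range(max_iter):
        # Bellman update: either traverse now, or pay the waiting cost and
        # face the expected value after one transition of the environment.
        V_new = np.minimum(arc_cost, wait_cost + P @ V)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    green = [k for k in range(K) if arc_cost[k] <= wait_cost + P[k] @ V]
    return V, green

# Made-up three-state environment: state 0 is cheap, state 2 is expensive
# but tends to drift back toward the cheaper states over time.
P = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
arc_cost = np.array([2.0, 6.0, 12.0])
wait_cost = 1.0

V, green = single_arc_green_red(P, arc_cost, wait_cost)
print("V =", np.round(V, 3))
print("green states (traverse now):", green)
print("red states (wait):", [k for k in range(3) if k not in green])

In this toy instance only the cheap state is green; in the expensive states it pays to wait for the chain to drift back before traversing. Policy iteration or parametric linear programming on the same Bellman equations would recover the identical classification.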