Dynamic Routing and Wavelength Assignment Using First Policy Iteration, Inhomogeneous Traffic Case

The routing and wavelength assignment (RWA) problem in a WDM network can be viewed as a Markov decision process (MDP). The state space is so large, however, that the exact solution cannot be computed in practice. Several heuristic algorithms have been presented in the literature, but these generally do not take into account additional information available about the traffic, e.g. inhomogeneous arrival rates. In this paper we propose an approach in which, starting from a given heuristic algorithm, a better algorithm is obtained by the first policy iteration. At each decision epoch a decision analysis is made in which the costs of all alternative actions are estimated by on-the-fly simulations. Being computationally intensive, the method can be used in real time only for systems with slow dynamics. Off-line, it can be used to assess how close the heuristic algorithms come to the optimal policy. Numerical examples of the policy improvement are given.
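
The decision step described above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the three-link toy network, the arrival rates in CLASSES, the first-fit heuristic used as the base policy, the cost (number of blocked calls over a finite horizon), the inclusion of "reject" as an alternative action, and all function names. For an arriving call, each alternative action is tried in turn, the base heuristic is followed in simulation from the resulting state, and the action with the lowest estimated cost is chosen.

```python
import random

W = 2      # wavelengths per link (hypothetical toy value)
MU = 1.0   # departure rate of each active call (assumed exponential holding times)
# Hypothetical traffic classes: (Poisson arrival rate, candidate routes).
# A route is a tuple of link ids; the rates are deliberately inhomogeneous.
CLASSES = [
    (0.2, [(0,), (1, 2)]),
    (1.0, [(1,)]),
    (1.0, [(2,)]),
]

def feasible(state, k):
    """(route, wavelength) pairs for class k that are free on every link of the route."""
    busy = {(link, w) for route, w in state for link in route}
    return [(r, w) for r in CLASSES[k][1] for w in range(W)
            if all((link, w) not in busy for link in r)]

def first_fit(state, k):
    """Base heuristic: first feasible route and wavelength, or None (block)."""
    acts = feasible(state, k)
    return acts[0] if acts else None

def simulate(state, horizon, rng):
    """Number of calls blocked within `horizon` when the base heuristic is followed."""
    calls, t, blocked = list(state), 0.0, 0
    rates = [lam for lam, _ in CLASSES]
    while True:
        dep_rate = MU * len(calls)
        total = sum(rates) + dep_rate
        t += rng.expovariate(total)          # time to the next event
        if t > horizon:
            return blocked
        if rng.random() * total < dep_rate:  # a uniformly chosen active call departs
            calls.pop(rng.randrange(len(calls)))
        else:                                # an arrival of class k
            k = rng.choices(range(len(CLASSES)), weights=rates)[0]
            action = first_fit(calls, k)
            if action is None:
                blocked += 1
            else:
                calls.append(action)

def improved_policy(state, k, n_runs=200, horizon=5.0, seed=0):
    """One policy-improvement step: pick the action with the lowest simulated cost."""
    best, best_cost = None, float("inf")
    for action in feasible(state, k) + [None]:   # None means rejecting the call
        rng = random.Random(seed)                # same seed for every alternative
        start = list(state) + ([action] if action is not None else [])
        future = sum(simulate(start, horizon, rng) for _ in range(n_runs)) / n_runs
        cost = (action is None) + future         # rejecting costs one blocked call now
        if cost < best_cost:
            best, best_cost = action, cost
    return best, best_cost

if __name__ == "__main__":
    print(improved_policy([], 0))
```

In this sketch the estimates are noisy, so the number of runs and the horizon trade accuracy against the per-decision computing time, which is why such a scheme suits only slowly changing systems when used in real time.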