On the existence of relative values for undiscounted Markovian decision processes with a scalar gain rate

Abstract. The functional equations v = max{q(f) − gT(f) + P(f)v; f ∈ K} ≡ Qv of undiscounted semi-Markovian decision processes are shown to be solvable if and only if all components of the maximum gain rate vector are equal. More generally, in the multichain case, the functional equations for the value vector possess a solution if and only if there is a policy which achieves the maximal gain vector. The method of proof exhibits vectors v⁺ and v⁻ such that Qv⁺ ⩽ v⁺ and Qv⁻ ⩾ v⁻.
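
As an illustration only, the following sketch checks the functional equations v = Qv numerically on a made-up two-state example with a scalar gain rate g. The rewards q, holding times T, transition probabilities, and the names `actions`, `evaluate`, and `Q` are all hypothetical and not taken from the paper. Fixing v[0] = 0 as the normalization is also an assumption, and enumerating policies is a brute-force check that stands in for the paper's proof technique.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical data: actions[i] lists (q, T, (p_i0, p_i1)) for state i.
actions = [
    [(F(6), F(1), (F(1, 2), F(1, 2))), (F(4), F(1), (F(1, 4), F(3, 4)))],
    [(F(-3), F(1), (F(1, 2), F(1, 2))), (F(-5), F(2), (F(3, 4), F(1, 4)))],
]

def evaluate(policy):
    """Solve v = q(f) - g T(f) + P(f) v with v[0] = 0 for a two-state policy f."""
    (q0, t0, p0), (q1, t1, p1) = (actions[i][a] for i, a in enumerate(policy))
    det = t0 * p1[0] + p0[1] * t1  # nonzero here: all probabilities are positive
    g = (q0 * p1[0] + p0[1] * q1) / det
    v1 = (t0 * q1 - t1 * q0) / det
    return g, (F(0), v1)

def Q(v, g):
    """Componentwise maximum over actions of q - g T + sum_j p_j v_j."""
    return tuple(
        max(q - g * t + sum(pj * vj for pj, vj in zip(p, v)) for q, t, p in acts)
        for acts in actions
    )

# Enumerate the four stationary policies and keep the one with the largest gain rate.
best_policy = max(product(range(2), repeat=2), key=lambda f: evaluate(f)[0])
g_star, v_star = evaluate(best_policy)

# For this data, the relative values of the best policy are a fixed point of Q.
print(best_policy, g_star, v_star, Q(v_star, g_star) == v_star)
```

In this example every policy has a single recurrent class, so the maximal gain rate is the same in both states and a solution v exists, in line with the first statement of the abstract.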