ON THE EXISTENCE OF OPTIMAL CONTROL IN CONTINUOUS TIME MARKOV DECISION PROCESSES

In this paper we treat the optimal control problem for continuous-time Markov decision processes with a Borel state space and a compact action space that varies with both time and state. The cost functional considered here is the sum of two terms: the integral over the finite horizon of a return rate, which depends on both the controller and the corresponding response, and the expected return of the system at the fixed final time. The optimal control problem is to find a controller that maximizes this cost functional over the given planning horizon. The main results are a necessary and sufficient condition for optimality and an algorithm for finding an optimal controller. B. L. Miller [1] treated a similar problem, but his paper was restricted to the case of finite state and action spaces. Our treatment of the more general setting relies on the implicit function lemma of K. Tsuji and N. Furukawa [3]. The method used to construct our algorithm is common in dynamic programming problems; see, for example, [4].
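
The cost functional described above can be sketched in symbols. The notation is assumed here for illustration and is not taken from the paper: $\pi$ denotes the controller, $x_t$ the state (the response), $a_t$ the action chosen from the compact set $A(t, x_t)$, $r$ the return rate, $g$ the terminal return, and $T$ the fixed final time. Taking the expectation over the integral term as well is also an assumption. Under these assumptions, one plausible reading is:

```latex
% Illustrative notation only; all symbols are assumptions, not the paper's own.
\[
  J(\pi) \;=\; \mathbb{E}^{\pi}\!\left[ \int_{0}^{T} r(t, x_t, a_t)\, dt \;+\; g(x_T) \right],
  \qquad a_t \in A(t, x_t),
\]
\[
  \text{find } \pi^{*} \text{ such that } J(\pi^{*}) \;=\; \sup_{\pi} J(\pi).
\]
```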