Flow control using the theory of zero sum Markov games

The author considers the problem of dynamic flow control of arriving packets into an infinite buffer. The service rate may depend on the state of the system, may change over time, and is unknown to the controller. The goal of the controller is to design an efficient policy that guarantees the best performance under the worst service conditions. The cost is composed of a holding cost, a cost for rejecting customers (packets), and a cost that depends on the quality of the service. The problem is studied in the framework of zero-sum Markov games, and a value iteration algorithm is used to solve it. It is shown that there exists an optimal stationary policy (one whose decisions depend only on the current number of customers in the queue); it is of a threshold type, and it uses randomization in at most one state.
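
To make the solution approach concrete, the following is a minimal sketch of value iteration for a discounted zero-sum Markov game (the Shapley operator: at each state, the value of a matrix game is computed), applied to a truncated, slotted approximation of a flow-control model. All parameters and the specific queue dynamics (N, beta, h, r, mu, q_cost, the accept/reject and service actions) are illustrative assumptions, not taken from the paper.

    import numpy as np
    from scipy.optimize import linprog

    def matrix_game_value(M):
        """Value of the zero-sum matrix game with cost matrix M
        (row player minimizes, column player maximizes), computed by LP."""
        m, n = M.shape
        c = np.zeros(m + 1)
        c[-1] = 1.0                                  # minimize the game value v
        A_ub = np.hstack([M.T, -np.ones((n, 1))])    # p^T M[:, j] <= v for every column j
        b_ub = np.zeros(n)
        A_eq = np.ones((1, m + 1)); A_eq[0, -1] = 0  # mixed strategy sums to 1
        b_eq = np.array([1.0])
        bounds = [(0, None)] * m + [(None, None)]    # p >= 0, v free
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
        return res.x[-1]

    # Illustrative parameters (not from the paper).
    N      = 20          # truncated buffer size
    beta   = 0.95        # discount factor
    h, r   = 1.0, 5.0    # holding cost per customer, rejection cost
    mu     = [0.3, 0.7]  # service-completion probabilities chosen by "nature"
    q_cost = [0.0, 2.0]  # cost attached to the better service quality

    def stage_cost(x, accept, b):
        return h * x + (0.0 if accept else r) + q_cost[b]

    def next_states(x, accept, b):
        """One-slot dynamics: an accepted arrival joins the queue,
        then a service completes with probability mu[b]."""
        y = min(x + (1 if accept else 0), N)
        if y == 0:
            return [(0, 1.0)]
        return [(y - 1, mu[b]), (y, 1.0 - mu[b])]

    V = np.zeros(N + 1)
    for _ in range(500):                   # value iteration on the Shapley operator
        V_new = np.empty_like(V)
        for x in range(N + 1):
            M = np.empty((2, 2))           # rows: accept/reject, columns: service action
            for a, accept in enumerate([True, False]):
                for b in range(2):
                    M[a, b] = stage_cost(x, accept, b) + beta * sum(
                        p * V[y] for y, p in next_states(x, accept, b))
            V_new[x] = matrix_game_value(M)
        V = V_new

In such a sketch, the controller's optimal mixed strategy at each state can be read off the same LP; under the structural result stated above, it would be deterministic (accept below a threshold, reject above it) except possibly at a single state where it randomizes.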