Paper information - Weighted Reward Criteria in Competitive Markov Decision Processes

Abstract: We consider Competitive Markov Decision Processes in which the controllers (players) are antagonistic and aggregate their sequences of expected rewards according to “weighted” or “horizon-sensitive” criteria. Each criterion is a convex combination either of two discounted objectives, or of one discounted and one limiting average reward objective. In both cases we establish the existence of the game-theoretic value vector and describe ε-optimal non-stationary strategies.
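
The two weighted criteria named in the abstract can be written out as a rough sketch. The notation here is an assumption for illustration and is not taken from the paper: s is the initial state, π and σ are the two players' strategies, r_t is the reward at stage t, β, β_1, β_2 ∈ [0, 1) are discount factors, and λ ∈ [0, 1] is the weight. The normalization of the discounted reward and the choice of liminf for the limiting average are likewise assumptions.

```latex
% Discounted and limiting average rewards (assumed notation and normalization)
\[
  v_{\beta}(s,\pi,\sigma) = \sum_{t=0}^{\infty} \beta^{t}\,
      \mathbb{E}_{s,\pi,\sigma}[r_t],
  \qquad
  v_{\alpha}(s,\pi,\sigma) = \liminf_{T\to\infty} \frac{1}{T+1}
      \sum_{t=0}^{T} \mathbb{E}_{s,\pi,\sigma}[r_t].
\]

% Criterion 1: convex combination of two discounted objectives
\[
  w_{1}(s,\pi,\sigma) = \lambda\, v_{\beta_1}(s,\pi,\sigma)
      + (1-\lambda)\, v_{\beta_2}(s,\pi,\sigma).
\]

% Criterion 2: convex combination of one discounted and one
% limiting average objective
\[
  w_{2}(s,\pi,\sigma) = \lambda\, v_{\beta}(s,\pi,\sigma)
      + (1-\lambda)\, v_{\alpha}(s,\pi,\sigma).
\]
```

Under this reading, a small discount factor weights early rewards heavily while the second term captures long-run behaviour, which is one way to interpret the term “horizon-sensitive”.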

Jerzy A. Filar | O. J. Vrieze
