Generalized inverses in discrete time Markov decision process
暂无分享,去创建一个
A new self-contained approach based on the Drazin generalized inverse is used to derive many basic results in discrete time, finite state Markov decision processes. A product form representation for the transition matrix of a stationary policy gives new derivations of the average reward evaluation equations, Laurent series expansions, as well as the finite test for Blackwell optimality. This representation also suggests new computational methods.