Adaptive Policies in Markov Decision Processes with Uncertain Transition Matrices
[1] U. Rieder. Bayesian dynamic programming, 1975, Advances in Applied Probability.
[2] D. Blackwell. Discounted Dynamic Programming, 1965.
[3] C. Derman. On Sequential Decisions and Markov Chains, 1962.
[4] B. L. Miller, et al. An Optimality Condition for Discrete Dynamic Programming with no Discounting, 1968.
[5] Adaptive competitive decision in repeated play of a matrix game with uncertain entries, 1968.
[6] B. Fox, et al. Adaptive Policies for Markov Renewal Programs, 1973.