Mean, variance, and probabilistic criteria in finite Markov decision processes: A review

This paper is a survey of papers which make use of nonstandard Markov decision process criteria (i.e., those which do not seek simply to optimize expected returns per unit time or expected discounted return). It covers infinite-horizon nondiscounted formulations, infinite-horizon discounted formulations, and finite-horizon formulations. For problem formulations in terms solely of the probabilities of being in each state and taking each action, policy equivalence results are given which allow policies to be restricted to the class of Markov policies or to the randomizations of deterministic Markov policies. For problems which cannot be stated in such terms, in terms of the primitive state set I, formulations involving a redefinition of the states are examined.
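As a rough illustration of what a formulation "in terms solely of the probabilities of being in each state and taking each action" can look like, the sketch below computes long-run state-action probabilities for a stationary randomized policy and derives a mean and a variance of the one-step reward from them. Everything in it is an assumption made for illustration: the two-state MDP, the reward values, the policy, and the names `P`, `r`, `policy`, `state_action_frequencies`, and `mean_and_variance` do not come from the paper. The variance of the one-step reward is only one of several variability measures used in this literature, and the paper may define its criteria differently.

```python
# Hypothetical two-state, two-action MDP; all numbers are made up for illustration.
# P[i][a][j]: probability of moving from state i to state j under action a.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]
# r[i][a]: one-step reward for taking action a in state i.
r = [[1.0, 4.0], [0.0, 2.0]]
# policy[i][a]: probability that a stationary randomized policy picks a in state i.
policy = [[0.5, 0.5], [1.0, 0.0]]


def state_action_frequencies(P, policy, iterations=2000):
    """Long-run probabilities x[i][a] of being in state i and taking action a."""
    n = len(P)
    # Transition matrix of the Markov chain induced by the policy.
    chain = [
        [sum(policy[i][a] * P[i][a][j] for a in range(len(P[i]))) for j in range(n)]
        for i in range(n)
    ]
    # Power iteration from the uniform distribution (assumes an ergodic chain).
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[i] * chain[i][j] for i in range(n)) for j in range(n)]
    return [[dist[i] * policy[i][a] for a in range(len(P[i]))] for i in range(n)]


def mean_and_variance(x, r):
    """Mean and variance of the one-step reward under frequencies x."""
    pairs = [(x[i][a], r[i][a]) for i in range(len(x)) for a in range(len(x[i]))]
    mean = sum(p * rew for p, rew in pairs)
    second_moment = sum(p * rew * rew for p, rew in pairs)
    return mean, second_moment - mean * mean


x = state_action_frequencies(P, policy)
mean, variance = mean_and_variance(x, r)
print(x, mean, variance)
```

Both quantities depend on the policy only through `x`, which is the property that lets such criteria be analyzed over state-action probabilities rather than over policies directly. The mean is linear in `x`, while the variance is not, which is one reason variance-type criteria fall outside the standard expected-return framework.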

[1] A. Charnes, et al. Chance-Constrained Programming, 1959.

[2] G. Dantzig, et al. The Decomposition Algorithm for Linear Programs, 1961.

[3] Abraham Charnes, et al. Chance Constraints and Normal Deviates, 1962.

[4] C. Derman. Optimal Replacement and Maintenance Under Markovian Deterioration with Probability Bounds on Failure, 1963.

[5] C. Derman. Stable sequential control rules and Markov chains, 1963.

[6] C. Derman. On Sequential Control Processes, 1964.

[7] C. Derman, et al. Some Remarks on Finite Horizon Markovian Decision Models, 1965.

[8] C. Derman, et al. A Note on Memoryless Rules for Controlling Sequential Control Processes, 1966.

[9] H. J. Greenberg. Dynamic Programming with Linear Uncertainty, 1968, Oper. Res.

[10] D. J. White, et al. Fundamentals of decision theory, 1969.

[11] A. Beja. Probability Bounds in Replacement Policies for Markov Systems, 1969.

[12] Cyrus Derman, et al. Finite State Markovian Decision Processes, 1970.

[13] S. C. Jaquette. Markov Decision Processes with a New Optimality Criterion: Small Interest Rates, 1972.

[14] C. Derman, et al. Constrained Markov Decision Chains, 1972.

[15] R. Howard, et al. Risk-Sensitive Markov Decision Processes, 1972.

[16] D. J. White. Technical Note - Dynamic Programming and Probabilistic Constraints, 1974, Oper. Res.

[17] M. J. Sobel. Ordinal Dynamic Programming, 1975.

[18] Evan L. Porteus. On the Optimality of Structured Policies in Countable Stage Decision Processes, 1975.

[19] S. C. Jaquette. A Utility Criterion for Markov Decision Processes, 1976.

[20] Juval Goldwerger. Dynamic Programming for a Stochastic Markovian Process with an Application to the Mean Variance Models, 1977.

[21] David M. Kreps. Decision Problems with Expected Utility Critera, I: Upper and Lower Convergent Utility, 1977, Math. Oper. Res.

[22] David M. Kreps. Decision Problems with Expected Utility Criteria, II: Stationarity, 1977, Math. Oper. Res.

[23] B. L. Miller. Communication---On “Dynamic Programming for a Stochastic Markovian Process with an Application to the Mean Variance Models” by J. Goldwerger, 1978.

[24] E. Steinberg, et al. A Preference Order Dynamic Program for a Knapsack Problem with Stochastic Rewards, 1979.

[25] Roy Mendelssohn. A systematic approach to determining mean-variance tradeoffs when managing randomly varying populations, 1980.

[26] Moshe Sniedovich, et al. Preference Order Stochastic Knapsack Problems: Methodological Issues, 1980.

[27] James G. Morris, et al. Decision Problems Under Risk and Chance Constrained Programming: Dilemmas in the Transition, 1981.

[28] M. J. Sobel. The variance of discounted Markov decision processes, 1982.

[29] D. White. Optimality and efficiency, 1982.

[30] Moshe Sniedovich. A Class of Variance-Constrained Problems, 1983, Oper. Res.

[31] L. C. M. Kallenberg, et al. Linear programming and finite Markovian control problems, 1984.

[32] Jerzy A. Filar, et al. Percentiles and markovian decision processes, 1983.

[33] Uriel G. Rothblum, et al. Multiplicative Markov Decision Chains, 1984, Math. Oper. Res.

[34] Arie Hordijk, et al. Constrained Undiscounted Stochastic Dynamic Programming, 1984, Math. Oper. Res.

[35] J. Filar, et al. Gain/variability tradeoffs in undiscounted Markov decision processes, 1985, 1985 24th IEEE Conference on Decision and Control.

[36] M. J. Sobel. Maximal mean/standard deviation ratio in an undiscounted MDP, 1985.