Finite State and Action MDPs
[1] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.
[2] A. S. Manne. Linear Programming and Sequential Decisions , 1960 .
[3] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[4] Cyrus Derman,et al. Replacement of periodically inspected equipment. (An optimal optional stopping rule) , 1960 .
[5] H. Robbins,et al. A Martingale System Theorem and Applications , 1961 .
[6] T. L. Saaty,et al. Progress in Operations Research. , 1961 .
[7] L. A. Zadeh,et al. Optimal Pursuit Strategies in Discrete-State Probabilistic Systems , 1962 .
[8] W. Jewell. MARKOV-RENEWAL PROGRAMMING , 1962 .
[9] M. Klein. Inspection—Maintenance—Replacement Schedules Under Markovian Deterioration , 1962 .
[10] D. Blackwell. Discrete Dynamic Programming , 1962 .
[11] C. Derman. On Sequential Decisions and Markov Chains , 1962 .
[12] D. White. Dynamic programming, Markov chains, and the method of successive approximations , 1963 .
[13] J. Neyman,et al. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability , 1963 .
[14] William S. Jewell,et al. Markov-Renewal Programming. I: Formulation, Finite Return Models , 1963 .
[15] W. Jewell. Markov-Renewal Programming. II: Infinite Return Models, Example , 1963 .
[16] D. Iglehart. Optimality of (s, S) Policies in the Infinite Horizon Dynamic Inventory Problem , 1963 .
[17] George Pólya,et al. Applied Combinatorial Mathematics , 1964 .
[18] George B. Dantzig,et al. Linear programming and extensions , 1965 .
[19] S. Karlin,et al. Mathematical Methods in the Social Sciences , 1962 .
[20] J. S. D. Cani. A Dynamic Programming Algorithm for Embedded Markov Chains when the Planning Horizon is at Infinity , 1964 .
[21] R. Bellman. Mathematical optimization techniques , 1964 .
[22] C. Derman,et al. Some Remarks on Finite Horizon Markovian Decision Models , 1965 .
[23] P. Schweitzer. Perturbation theory and Markovian decision processes. , 1965 .
[24] Onésimo Hernández-Lerma,et al. Controlled Markov Processes , 1965 .
[25] W. Barry. On the Iterative Method of Dynamic Programming on a Finite Space Discrete Time Markov Process , 1965 .
[26] C. Derman,et al. A Note on Memoryless Rules for Controlling Sequential Control Processes , 1966 .
[27] P. Kolesar. Minimum Cost Replacement Under Markovian Deterioration , 1966 .
[28] R. Bellman. Dynamic programming. , 1957, Science.
[29] A. F. Veinott. ON FINDING OPTIMAL POLICIES IN DISCRETE DYNAMIC PROGRAMMING WITH NO DISCOUNTING , 1966 .
[30] J. MacQueen. A MODIFIED DYNAMIC PROGRAMMING METHOD FOR MARKOVIAN DECISION PROBLEMS , 1966 .
[31] Richard D. Smallwood,et al. Optimum Policy Regions for Markov Processes with Discounting , 1966, Oper. Res..
[32] R. Strauch,et al. A PROPERTY OF SEQUENTIAL CONTROL PROCESSES , 1966 .
[33] Arthur F. Veinott, Jr. On the Optimality of $(s, S)$ Inventory Policies: New Conditions and a New Proof , 1966 .
[34] G. D. Eppen,et al. Linear Programming Solutions for Separable Markovian Decision Problems , 1967 .
[35] E. Denardo. CONTRACTION MAPPINGS IN THE THEORY UNDERLYING DYNAMIC PROGRAMMING , 1967 .
[36] J. MacQueen,et al. Letter to the Editor - A Test for Suboptimal Actions in Markovian Decision Problems , 1967, Oper. Res..
[37] H. Mine,et al. Linear programming algorithms for semi-Markovian decision processes , 1968 .
[38] B. L. Miller,et al. An Optimality Condition for Discrete Dynamic Programming with no Discounting , 1968 .
[39] N. A. J. Hastings,et al. Some Notes on Dynamic Programming and Replacement , 1968 .
[40] Linus Schrage,et al. Letter to the Editor - A Proof of the Optimality of the Shortest Remaining Processing Time Discipline , 1968, Oper. Res..
[41] B. Fox. (g, w)—Optima in Markov Renewal Programs , 1968 .
[42] E. Denardo,et al. Multichain Markov Renewal Programs , 1968 .
[43] E. Denardo. Separable Markovian Decision Problems , 1968 .
[44] P. Schweitzer. Perturbation theory and finite Markov chains , 1968 .
[45] N. A. J. Hastings,et al. Optimization of Discounted Markov Decision Problems , 1969 .
[46] M. Pollatschek,et al. Algorithms for Stochastic Games with Geometrical Interpretation , 1969 .
[47] A. F. Veinott. Discrete Dynamic Programming with Sensitive Discount Optimality Criteria , 1969 .
[48] B. L. Miller,et al. Discrete Dynamic Programming with a Small Interest Rate , 1969 .
[49] Sheldon M. Ross,et al. A Problem in Optimal Search and Stop , 1969, Oper. Res..
[50] N. Hastings. The Repair Limit Replacement Method , 1969 .
[51] Amedeo R. Odoni,et al. On Finding the Maximal Gain for Markov Decision Processes , 1969, Oper. Res..
[52] Steven A. Lippman,et al. Letter to the Editor - Criterion Equivalence in Discrete Dynamic Programming , 1969, Oper. Res..
[53] K. Hinderer,et al. Foundations of Non-stationary Dynamic Programming with Discrete Time Parameter , 1970 .
[54] Cyrus Derman,et al. Finite State Markovian Decision Processes , 1970 .
[55] Eric V. Denardo,et al. Computing a Bias-Optimal Policy in a Discrete-Time Markov Decision Problem , 1970, Oper. Res..
[56] Harold J. Kushner,et al. Accelerated procedures for the solution of discrete Markov control problems , 1971 .
[57] Arie Hordijk,et al. A sufficient condition for the existence of an optimal policy with respect to the average cost criterion in markovian decision processes : Prepublication , 1971 .
[58] P. Schweitzer. Iterative solution of the functional equations of undiscounted Markov renewal programming , 1971 .
[59] Paul J. Schweitzer. Multiple Policy Improvements in Undiscounted Markov Renewal Programming , 1971, Oper. Res..
[60] N. A. J. Hastings. Technical Note - Bounds on the Gain of a Markov Decision Process , 1971, Oper. Res..
[61] Evan L. Porteus. Some Bounds for Discounted Sequential Decision Processes , 1971 .
[62] E. Denardo. Markov Renewal Programs with Small Interest Rates , 1971 .
[63] Thomas E. Morton. Technical Note - Undiscounted Markov Renewal Programming Via Modified Successive Approximations , 1971, Oper. Res..
[64] H. Kushner. Introduction to stochastic control , 1971 .
[65] C. Derman,et al. Constrained Markov Decision Chains , 1972 .
[66] J. Bather. Optimal decision procedures for finite Markov chains. Part III: General convex systems , 1973 .
[67] Milton C. Chew. Optimal Stopping in a Discrete Search Problem , 1973, Oper. Res..
[68] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..
[69] N. Hastings,et al. Tests for Suboptimal Actions in Discounted Markov Programming , 1973 .
[70] J. Bather. Optimal decision procedures for finite Markov chains. Part II: Communicating systems , 1973, Advances in Applied Probability.
[71] Edward P. C. Kao,et al. Optimal Replacement Rules when Changes of State are Semi-Markovian , 1973, Oper. Res..
[72] E. Denardo. A Markov Decision Problem , 1973 .
[73] Richard C. Grinold,et al. Technical Note - Elimination of Suboptimal Actions in Markov Decision Problems , 1973, Oper. Res..
[74] Dieter Reetz,et al. Solution of a Markovian decision problem by successive overrelaxation , 1973, Z. Oper. Research.
[75] J. Bather. Optimal decision procedures for finite Markov chains. Part I: Examples , 1973, Advances in Applied Probability.
[76] Arie Hordijk,et al. Technical Note - The Method of Successive Approximations and Markovian Decision Problems , 1974, Oper. Res..
[77] Arie Hordijk,et al. Dynamic programming and Markov potential theory , 1974 .
[78] Karel Sladký,et al. On the set of optimal controls for Markov chains with rewards , 1974, Kybernetika.
[79] Helmut Schellhaas,et al. Zur Extrapolation in Markoffschen Entscheidungsmodellen mit Diskontierung , 1974, Z. Oper. Research.
[80] Sheldon M. Ross,et al. Dynamic programming and gambling models , 1974, Advances in Applied Probability.
[81] Jaap Wessels,et al. Discounted semi-Markov decision processes : linear programming and policy iteration , 1975 .
[82] J. Wessels,et al. A principle for generating optimization procedures for discounted Markov decision processes , 1974 .
[83] David Michael Burley,et al. Studies in optimization , 1974 .
[84] A. Hordijk,et al. A MODIFIED FORM OF THE ITERATIVE METHOD OF DYNAMIC PROGRAMMING , 1975 .
[85] Evan L. Porteus. Bounds and Transformations for Discounted Finite Markov Decision Chains , 1975, Oper. Res..
[86] J. Gani,et al. Progress in statistics , 1975 .
[87] J. Shapiro. Brouwer's fixed point theorem and finite state space Markovian decision theory , 1975 .
[88] A. Hordijk,et al. On a Conjecture of Iglehart , 1975 .
[89] Jo van Nunen,et al. A set of successive approximation methods for discounted Markovian decision problems , 1976, Math. Methods Oper. Res..
[90] Dimitri P. Bertsekas,et al. On error bounds for successive approximation methods , 1976 .
[91] N. Hastings,et al. Note---A Test for Nonoptimal Actions in Undiscounted Finite Markov Decision Chains , 1976 .
[92] J.A.E.E. van Nunen,et al. The action elimination algorithm for Markov decision processes , 1976 .
[93] Chelsea C. White,et al. Procedures for the Solution of a Finite-Horizon, Partially Observed, Semi-Markov Optimization Problem , 1976, Oper. Res..
[94] J. A. E. E. van Nunen. Contracting Markov decision processes , 1976 .
[95] Dieter Reetz,et al. A decision exclusion algorithm for a class of Markovian Decision Processes , 1976, Math. Methods Oper. Res..
[96] John G. Kemeny,et al. Finite Markov chains , 1960 .
[97] Michael Scriabin,et al. Maintenance Scheduling for Multicomponent Equipment , 1977 .
[98] G. Hübner. Improved Procedures for Eliminating Suboptimal Actions in Markov Programming by the Use of Contraction Properties , 1977 .
[99] P. Schweitzer,et al. DISCOUNTED AND UNDISCOUNTED VALUE-ITERATION IN MARKOV DECISION PROBLEMS: A SURVEY , 1977 .
[100] Jan van der Wal,et al. Successive approximations for convergent dynamic programming , 1977 .
[101] Dimitri P. Bertsekas,et al. Dynamic Programming and Stochastic Control , 1977, IEEE Transactions on Systems, Man, and Cybernetics.
[102] Loren Platzman,et al. Technical Note - Improved Conditions for Convergence in Undiscounted Markov Renewal Programming , 1977, Oper. Res..
[103] Paul J. Schweitzer,et al. The Asymptotic Behavior of Undiscounted Value Iteration in Markov Decision Problems , 1977, Math. Oper. Res..
[104] D. White. ELIMINATION OF NON-OPTIMAL ACTIONS IN MARKOV DECISION PROCESSES , 1978 .
[105] P. Schweitzer,et al. Foolproof convergence in multichain Policy Iteration , 1978 .
[106] Evan L. Porteus,et al. Technical Note - Accelerated Computation of the Expected Discounted Return in a Markov Chain , 1978, Oper. Res..
[107] Paul J. Schweitzer,et al. The Functional Equations of Undiscounted Markov Renewal Programming , 1971, Math. Oper. Res..
[108] Kees M. van Hee,et al. Markov Strategies in Dynamic Programming , 1978, Math. Oper. Res..
[109] M. Puterman,et al. Modified Policy Iteration Algorithms for Discounted Markov Decision Problems , 1978 .
[110] P. Schweitzer. Contraction mappings underlying undiscounted Markov decision problems—II , 1978 .
[111] Donald R. Smith. Optimal Repairman Allocation—Asymptotic Results , 1978 .
[112] Martin L. Puterman,et al. Contracting Markov Decision Processes. (Mathematical Centre Tract 71.) , 1978 .
[113] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..
[114] Martin L. Puterman,et al. Dynamic Programming and Its Application , 1979 .
[115] Jaap Wessels,et al. Markov Decision Theory , 1979 .
[116] Uriel G. Rothblum,et al. Overtaking Optimality for Markov Decision Chains , 1979, Math. Oper. Res..
[117] Uriel G. Rothblum,et al. Optimal stopping, exponential utility, and linear programming , 1979, Math. Program..
[118] S. Christian Albright,et al. Structural Results for Partially Observable Markov Decision Processes , 1979, Oper. Res..
[119] N. Hastings,et al. Markov programming with policy constraints , 1979 .
[120] J. Gittins. Bandit processes and dynamic allocation indices , 1979 .
[121] Martin L. Puterman,et al. On the Convergence of Policy Iteration in Stationary Dynamic Programming , 1979, Math. Oper. Res..
[122] A. Hordijk,et al. Linear Programming and Markov Decision Chains , 1979 .
[123] P. Schweitzer,et al. Geometric convergence of value-iteration in multichain Markov decision problems , 1979, Advances in Applied Probability.
[124] Awi Federgruen,et al. A New Specification of the Multichain Policy Iteration Algorithm in Undiscounted Markov Renewal Programs , 1980 .
[125] J. Wal. The method of value oriented successive approximations for the average reward Markov decision process , 1980 .
[126] P. Whittle. Multi‐Armed Bandits and the Gittins Index , 1980 .
[127] Evan L. Porteus. Improved iterative computation of the expected discounted return in Markov and semi-Markov chains , 1980, Z. Oper. Research.
[128] Nagata Furukawa,et al. Characterization of Optimal Policies in Vector-Valued Markovian Decision Processes , 1980, Math. Oper. Res..
[129] Anthony Ephremides,et al. A simple dynamic routing problem , 1980 .
[130] K. Ohno. A UNIFIED APPROACH TO ALGORITHMS WITH A SUBOPTIMALITY TEST IN DISCOUNTED SEMI-MARKOV DECISION PROCESSES , 1981 .
[131] Dieter Spreen,et al. A further anticycling rule in multichain policy iteration for undiscounted Markov renewal programs , 1981, Z. Oper. Research.
[132] L. Thomas. Second order bounds for Markov Decision Processes , 1981 .
[133] Evan L. Porteus. Computing the discounted return in Markov and semi-Markov chains , 1981 .
[134] Martin L. Puterman,et al. Computational methods for Markov decision processes , 1981 .
[135] Greg N. Frederickson,et al. Sequencing Tasks with Exponential Service Times to Minimize the Expected Flow Time or Makespan , 1981, JACM.
[136] Matthew J. Sobel,et al. Myopic Solutions of Markov Decision Processes and Stochastic Games , 1981, Oper. Res..
[137] Y. S. Sherif,et al. Optimal maintenance models for systems subject to failure–A Review , 1981 .
[138] Martin L. Puterman,et al. Action Elimination Procedures for Modified Policy Iteration Algorithms , 1982, Oper. Res..
[139] Daniel P. Heyman,et al. Stochastic models in operations research , 1982 .
[140] R. Weber. Scheduling jobs by stochastic processing requirements on parallel machines to minimize makespan or flowtime , 1982, Journal of Applied Probability.
[141] Mohammad Roosta,et al. Routing through a network with maximum reliability , 1982 .
[142] G. Monahan. State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 1982 .
[143] Gideon Weiss,et al. Multiserver Stochastic Scheduling , 1982 .
[144] D. White. Multi-objective infinite-horizon discounted Markov decision processes , 1982 .
[145] L. C. M. Kallenberg,et al. Linear Programming to Compute a Bias-Optimal Policy , 1982 .
[146] Awi Federgruen,et al. Markovian control problems : functional equations and algorithms , 1983 .
[147] M. I. Henig. Vector-Valued Dynamic Programming , 1983 .
[148] R. Hartley,et al. Optimisation Over Time: Dynamic Programming and Stochastic Control: , 1983 .
[149] Volker Nollau,et al. Markov decision problems with countable state spaces : optimality criteria, algorithms, clustering , 1983 .
[150] Masami Kurano. Adaptive Policies in Markov Decision Processes with Uncertain Transition Matrices , 1983 .
[151] L. C. M. Kallenberg,et al. Linear programming and finite Markovian control problems , 1984 .
[152] Kevin Mahon,et al. Deterministic and Stochastic Scheduling , 1983 .
[153] Diethard Pallaschke,et al. Selected Topics in Operations Research and Mathematical Economics , 1984 .
[154] Arie Hordijk,et al. Transient policies in discrete dynamic programming: Linear programming including suboptimality tests and additional constraints , 1984, Math. Program..
[155] Ulrich D. Holzbaur,et al. Entscheidungsmodelle über angeordneten Körpern , 1984 .
[156] Shmuel Gal. An $O(N^3 )$ Algorithm for Optimal Replacement Problems , 1984 .
[157] P. Schweitzer,et al. A Fixed Point Approach to Undiscounted Markov Renewal Programs , 1984 .
[158] Paul J. Schweitzer,et al. Successive Approximation Methods for Solving Nested Functional Equations in Markov Decision Problems , 1984, Math. Oper. Res..
[159] P. R. Kumar,et al. Optimal control of a queueing system with two heterogeneous servers , 1984 .
[160] Moshe Haviv,et al. Truncated policy iteration methods , 1984 .
[161] Arie Hordijk,et al. Constrained Undiscounted Stochastic Dynamic Programming , 1984, Math. Oper. Res..
[162] Awi Federgruen,et al. An Efficient Algorithm for Computing Optimal (s, S) Policies , 1984, Oper. Res..
[163] Michael N. Katehakis,et al. Optimal Repair Allocation in a Series System , 1984, Math. Oper. Res..
[164] Paul J. Schweitzer,et al. A value-iteration scheme for undiscounted multichain Markov renewal programs , 1984, Z. Oper. Research.
[165] Norbert J. Schmitz,et al. How good is Howard's policy improvement algorithm? , 1985, Z. Oper. Research.
[166] D. J. White,et al. Real Applications of Markov Decision Processes , 1985 .
[167] Armand M. Makowski,et al. K competing queues with geometric service requirements and linear costs: The μc-rule is always optimal , 1985 .
[168] Shaler Stidham, Jr. Optimal control of admission to a queueing system , 1985 .
[169] Jean Walrand,et al. Extensions of the multiarmed bandit problem: The discounted case , 1985 .
[170] Rommert Dekker,et al. Sensitivity-analysis in discounted Markovian decision problems , 1985 .
[171] F. Beutler,et al. Optimal policies for controlled markov chains with a constraint , 1985 .
[172] P. Schweitzer,et al. Generalized polynomial approximations in Markovian decision processes , 1985 .
[173] M. J. Sobel. Maximal mean/standard deviation ratio in an undiscounted MDP , 1985 .
[174] Michael N. Katehakis,et al. Linear Programming for Finite State Multi-Armed Bandit Problems , 1986, Math. Oper. Res..
[175] Lodewijk C. M. Kallenberg,et al. A Note on M. N. Katehakis' and Y.-R. Chen's Computation of the Gittins Index , 1986, Math. Oper. Res..
[176] J. Tsitsiklis. A lemma on the multiarmed bandit problem , 1986 .
[177] Jerzy A. Filar,et al. Multiobjective Markov decision process with average reward criterion , 1986 .
[178] U. Meister,et al. A polynomial time bound for Howard's policy improvement algorithm , 1986 .
[179] Lyn C. Thomas,et al. Computational comparison of policy iteration algorithms for discounted Markov decision processes , 1986, Comput. Oper. Res..
[180] R. B. Kulkarni,et al. Linear programming formulations of Markov decision processes , 1986 .
[181] Henk Tijms,et al. Stochastic modelling and analysis: a computational approach , 1986 .
[182] U. Holzbaur. Sensitivitätsanalysen in Entscheidungsmodellen 1 , 1986 .
[183] K.-J. Bierth. An expected average reward criterion , 1987 .
[184] Dimitri P. Bertsekas,et al. Dynamic Programming: Deterministic and Stochastic Models , 1987 .
[185] H. Kawai. A variance minimization problem for a Markov decision process , 1987 .
[186] R. Weber,et al. Optimal control of service rates in networks of queues , 1987, Advances in Applied Probability.
[187] VARIANCE CONSTRAINED MARKOV DECISION PROCESS , 1987 .
[188] O. J. Vrieze,et al. Stochastic Games with Finite State and Action Spaces. , 1988 .
[189] William S. Lovejoy,et al. Some Monotonicity Results for Partially Observed Markov Decision Processes , 1987, Oper. Res..
[190] Michael N. Katehakis,et al. The Multi-Armed Bandit Problem: Decomposition and Computation , 1987, Math. Oper. Res..
[191] P. Schweitzer. A Brouwer fixed-point mapping approach to communicating Markov decision processes , 1987 .
[192] D. J. White,et al. Further Real Applications of Markov Decision Processes , 1988 .
[193] M. Yasuda. The optimal value of Markov stopping problems with one-step look ahead policy , 1988, Journal of Applied Probability.
[194] D. White. Mean, variance, and probabilistic criteria in finite Markov decision processes: A review , 1988 .
[195] Süleyman Özekici. Optimal Periodic Replacement of Multicomponent Reliability Systems , 1988, Oper. Res..
[196] J. Ben Atkinson,et al. An Introduction to Queueing Networks , 1988 .
[197] G. Hübner. A unified approach to adaptive control of average reward Markov decision processes , 1988 .
[198] Gideon Weiss,et al. Branching Bandit Processes , 1988, Probability in the Engineering and Informational Sciences.
[199] M. K rn,et al. Stochastic Optimal Control , 1988 .
[200] J. Stein. On efficiency of linear programming applied to discounted Markovian decision problems , 1988 .
[201] Jerzy A. Filar,et al. Variance-Penalized Markov Decision Processes , 1989, Math. Oper. Res..
[202] Chelsea C. White,et al. Solution Procedures for Partially Observed Markov Decision Processes , 1989, Oper. Res..
[203] Michael N. Katehakis,et al. On the maintenance of systems composed of highly reliable components , 1989 .
[204] Keith W. Ross,et al. Randomized and Past-Dependent Policies for Markov Decision Processes with Multiple Constraints , 1989, Oper. Res..
[205] F. A. van der Duyn Schouten,et al. Analysis and computation of (n,N) : Strategies for maintenance of a two-component system , 1989 .
[206] Kun-Jen Chung. A note on maximal mean/standard deviation ratio in an undiscounted MDP , 1989 .
[207] O. Hernández-Lerma. Adaptive Markov Control Processes , 1989 .
[208] M. K. Ghosh. Markov decision processes with multiple costs , 1990 .
[209] Sjur Didrik Flåm,et al. A bisection/successive approximation method for computing Gittins indices , 1990, ZOR Methods Model. Oper. Res..
[210] D. Preßmar,et al. Operations research proceedings , 1990 .
[211] M. Puterman,et al. An improved algorithm for solving communicating average reward Markov decision processes , 1991 .
[212] Eitan Altman,et al. Sensitivity of constrained Markov decision processes , 1991, Ann. Oper. Res..
[213] R. W. Owen,et al. New results for generalized bandit problems , 1991 .
[214] Keith W. Ross,et al. Multichain Markov Decision Processes with a Sample Path Constraint: A Decomposition Approach , 1991, Math. Oper. Res..
[215] Chelsea C. White,et al. A survey of solution techniques for the partially observed Markov decision process , 1991, Ann. Oper. Res..
[216] E. Altman,et al. Adaptive control of constrained Markov chains: Criteria and policies , 1991 .
[217] A. Shwartz,et al. Adaptive control of constrained Markov chains , 1991 .
[218] Charles J. Colbourn. Combinatorial aspects of network reliability , 1991, Ann. Oper. Res..
[219] Refael Hassin. Multiterminal xcut problems , 1991, Ann. Oper. Res..
[220] Steven I. Marcus,et al. On the computation of the optimal cost function for discrete time Markov models with partial observations , 1991, Ann. Oper. Res..
[221] John N. Tsitsiklis,et al. An Analysis of Stochastic Shortest Path Problems , 1991, Math. Oper. Res..
[222] Awi Federgruen,et al. Finding Optimal (s, S) Policies Is About As Simple As Evaluating a Single Policy , 1991, Oper. Res..
[223] William S. Lovejoy,et al. Computationally Feasible Bounds for Partially Observed Markov Decision Processes , 1991, Oper. Res..
[224] Ulrich Rieder,et al. Structural results for partially observed control models , 1991, ZOR Methods Model. Oper. Res..
[225] R. Cavazos-Cadena. Nonparametric estimation and adaptive control in a class of finite Markov decision chains , 1991 .
[226] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes , 1991 .
[227] Paul J. Schweitzer,et al. Block-scaling of value-iteration for discounted Markov renewal programming , 1991, Ann. Oper. Res..
[228] K. Wakuta. Optimal stationary policies in the vector-valued Markov decision process , 1992 .
[229] R. Weber. On the Gittins Index for Multiarmed Bandits , 1992 .
[230] Keith W. Ross,et al. Variability Sensitive Markov Decision Processes , 1992, Math. Oper. Res..
[231] K. Ohno,et al. Multiobjective undiscounted Markov renewal program and its application to a tool replacement problem in an FMS , 1992 .
[232] Kun-Jen Chung. Remarks on maximal mean/standard deviation ratio in undiscounted MDPs , 1992 .
[233] K. Ohno,et al. Multi-objective discounted Markov decision processes with expectation and variance criteria , 1992 .
[234] L. Kallenberg. Separable Markovian decision problems , 1992 .
[235] D. J. White. Computational approaches to variance-penalised Markov decision processes , 1992 .
[236] E. Frostig. Optimal policies for machine repairmen problems , 1993 .
[237] M. Sun. Revised simplex algorithm for finite Markov decision processes , 1993 .
[238] Shaler Stidham,et al. A survey of Markov decision models for control of networks of queues , 1993, Queueing Syst. Theory Appl..
[239] D. J. White,et al. A Survey of Applications of Markov Decision Processes , 1993 .
[240] Dimitris Bertsimas,et al. Conservation laws, extended polymatroids and multi-armed bandit problems: a unified approach to indexable systems , 2011, IPCO.
[241] U. Holzbaur. Bounds for the quality and the number of steps in Bellman's value iteration algorithm , 1994 .
[242] U. Yechiali,et al. Accelerating Procedures of the Value Iteration Algorithm for Discounted Markov Decision Processes, Based on a One-Step Lookahead Analysis , 1994 .
[243] J. Lasserre. A new policy iteration scheme for Markov decision processes using Schweitzer's formula , 1994, Journal of Applied Probability.
[244] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[245] Kun-Jen Chung. Mean-Variance Tradeoffs in an Undiscounted MDP: The Unichain Case , 1994, Oper. Res..
[246] Chelsea C. White,et al. Finite-Memory Suboptimal Design for Partially Observed Markov Decision Processes , 1994, Oper. Res..
[247] P. Varaiya,et al. Multi-Armed bandit problem revisited , 1994 .
[248] Arie Hordijk,et al. Undiscounted Markov decision chains with partial information; an algorithm for computing a locally optimal periodic policy , 1994, Math. Methods Oper. Res..
[249] Moshe Shaked,et al. Stochastic orders and their applications , 1994 .
[250] Jean B. Lasserre,et al. Detecting optimal and non-optimal actions in average-cost Markov decision processes , 1994 .
[251] J. Tsitsiklis. A short proof of the Gittins index theorem , 1993, Proceedings of 32nd IEEE Conference on Decision and Control.
[252] Ying Huang,et al. On Finding Optimal Policies for Markov Decision Chains: A Unifying Framework for Mean-Variance-Tradeoffs , 1994, Math. Oper. Res..
[253] Eugene A. Feinberg,et al. Markov Decision Models with Weighted Discounted Criteria , 1994, Math. Oper. Res..
[254] Matthew J. Sobel,et al. Mean-Variance Tradeoffs in an Undiscounted MDP , 1994, Oper. Res..
[255] D. J. White. A mathematical programming approach to a problem in variance penalised Markov decision processes , 1994 .
[256] Gideon Weiss,et al. The Stochastic Optimality of SEPT in Parallel Machine Scheduling , 1994, Probability in the Engineering and Informational Sciences.
[257] K. Wakuta. Vector-valued Markov decision processes and the systems of linear inequalities , 1995 .
[258] Eitan Altman,et al. The Linear Program approach in multi-chain Markov Decision Processes revisited , 1995, Math. Methods Oper. Res..
[259] K. Glazebrook,et al. On transforming an index for generalised bandit problems , 1995, Journal of Applied Probability.
[260] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[261] D. J. White. A superharmonic approach to solving infinite horizon partially observable Markov decision problems , 1995, Math. Methods Oper. Res..
[262] Dimitri P. Bertsekas,et al. Generic rank-one corrections for value iteration in Markovian decision problems , 1995, Oper. Res. Lett..
[263] Arie Hordijk,et al. Markov Decision Chains , 1996 .
[264] Kevin D. Glazebrook,et al. Reflections on a New Approach to Gittins Indexation , 1996 .
[265] J. Filar,et al. Competitive Markov Decision Processes , 1996 .
[266] Eitan Altman,et al. On the value function in constrained control of Markov chains , 1996, Math. Methods Oper. Res..
[267] Kazuyoshi Wakuta,et al. A new class of policies in vector-valued Markov decision processes , 1996 .
[268] Apostolos Burnetas,et al. Optimal Adaptive Policies for Markov Decision Processes , 1997, Math. Oper. Res..
[269] D. Bertsekas. A New Value Iteration method for the Average Cost Dynamic Programming Problem , 1998 .
[270] K. D. Glazebrook,et al. On a new approach to the analysis of complex multi-armed bandits , 1998, Math. Methods Oper. Res..
[271] L. Sennott. Stochastic Dynamic Programming and the Control of Queueing Systems , 1998 .
[272] O. Hernández-Lerma,et al. Discrete-time Markov control processes , 1999 .
[273] E. Altman. Constrained Markov Decision Processes , 1999 .
[274] Michael K. Ng. A note on policy algorithms for discounted Markov decision problems , 1999, Oper. Res. Lett..
[275] O. Hernández-Lerma,et al. Further topics on discrete-time Markov control processes , 1999 .
[276] Isaac Sonin,et al. The Elimination algorithm for the problem of optimal stopping , 1999, Math. Methods Oper. Res..
[277] Kazuyoshi Wakuta,et al. A note on the structure of value spaces in vector-valued Markov decision processes , 1999, Math. Methods Oper. Res..
[278] J. Stoer,et al. Introduction to Numerical Analysis , 2002 .
[279] Eric V. Denardo,et al. Dynamic Programming: Models and Applications , 2003 .