A fuzzy approach to Markov decision processes with uncertain transition probabilities

In this paper, Markov decision models with uncertain transition matrices, in which the matrix is allowed to fluctuate at each time step, are described using fuzzy sets. We seek a Pareto optimal policy that maximizes the infinite-horizon fuzzy expected discounted reward (FEDR) over all stationary policies with respect to a partial order. The Pareto optimal policies are characterized by maximal solutions of an optimal inclusion involving efficient set-functions. As a numerical example, a machine maintenance problem is considered.
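To make the idea of a discounted reward under an uncertain, step-by-step fluctuating transition matrix more concrete, the following is a minimal sketch and not the paper's own algorithm. It assumes that one level set (α-cut) of the fuzzy transition matrix can be written as elementwise bounds `lo <= P <= hi` with each row summing to 1, fixes a single stationary policy, and computes lower and upper bounds on its expected discounted reward by interval value iteration. The function names, the interval representation, and the numbers in the usage example are all illustrative assumptions.

```python
def extreme_row(lo, hi, v, maximize):
    """Return a probability row p with lo <= p <= hi and sum(p) == 1
    that maximizes (or minimizes) the inner product p . v.

    Assumes sum(lo) <= 1 <= sum(hi), so a feasible row exists.
    """
    p = list(lo)                 # start from the lower bounds
    slack = 1.0 - sum(lo)        # probability mass still to distribute
    # Give the remaining mass to the best (or worst) valued states first.
    order = sorted(range(len(v)), key=lambda j: v[j], reverse=maximize)
    for j in order:
        add = min(hi[j] - lo[j], slack)
        p[j] += add
        slack -= add
    return p


def discounted_bounds(reward, lo, hi, beta, iters=500):
    """Bounds on the expected discounted reward of one fixed policy.

    reward[i] : immediate reward in state i under the policy (assumed crisp)
    lo, hi    : elementwise bounds on the transition matrix (one alpha-cut)
    beta      : discount factor, 0 <= beta < 1
    The matrix may change at every step, so each row is chosen
    independently in every iteration.
    """
    n = len(reward)
    lower = [0.0] * n
    upper = [0.0] * n
    for _ in range(iters):
        new_lower, new_upper = [], []
        for i in range(n):
            p_min = extreme_row(lo[i], hi[i], lower, maximize=False)
            p_max = extreme_row(lo[i], hi[i], upper, maximize=True)
            new_lower.append(reward[i] + beta * sum(p * x for p, x in zip(p_min, lower)))
            new_upper.append(reward[i] + beta * sum(p * x for p, x in zip(p_max, upper)))
        lower, upper = new_lower, new_upper
    return lower, upper


# Illustrative two-state example (numbers are made up for this sketch).
reward = [1.0, 0.0]
lo = [[0.6, 0.2], [0.3, 0.5]]
hi = [[0.8, 0.4], [0.5, 0.7]]
lower, upper = discounted_bounds(reward, lo, hi, beta=0.9)
```

Repeating this computation over several α levels would give a family of nested intervals, which is one common way to represent a fuzzy number; comparing such interval-valued rewards across policies requires a partial order, which is where Pareto optimality enters.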
