A Hybrid Genetic/Optimization Algorithm for Finite-Horizon, Partially Observed Markov Decision Processes

The partially observed Markov decision process (POMDP) is a generalization of a Markov decision process that allows for noise-corrupted and costly observations of the underlying system state. The value function of the finite-horizon POMDP is known to be piecewise affine and convex in the probability mass vector over the state space. Such a function can be represented by a finite set of affine functions. In this paper, we develop and evaluate an exact algorithm, GAMIP, which combines a genetic algorithm and a mixed integer program to construct the minimal set of affine functions that describes the value function. Numerical results indicate that GAMIP takes up to 60% less time to construct the minimal set than does the most efficient linear programming-based exact solution method in the literature.
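To make the representation concrete, the following minimal sketch evaluates a piecewise affine, convex value function as the pointwise maximum of a finite set of affine functions of the belief vector. It is an illustration only and is not the GAMIP algorithm: the names `value` and `drop_pointwise_dominated` and the example vectors are assumptions introduced here, and the componentwise-dominance check shown is a cheap filter that generally does not yield the minimal set (exact methods rely on linear programs or, in this paper, a genetic algorithm combined with a mixed integer program).

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def value(belief: Sequence[float], alphas: List[Vector]) -> float:
    """Evaluate V(b) = max_k <alpha_k, b> for a belief (probability mass) vector b."""
    return max(sum(a * b for a, b in zip(alpha, belief)) for alpha in alphas)


def drop_pointwise_dominated(alphas: List[Vector]) -> List[Vector]:
    """Discard vectors that are componentwise <= some other vector.

    Hypothetical helper: this is only a necessary-condition filter. A vector
    that survives may still be redundant, so the result is a superset of the
    minimal set in general.
    """
    kept = []
    for i, a in enumerate(alphas):
        dominated = False
        for j, other in enumerate(alphas):
            if j == i:
                continue
            if all(o >= x for o, x in zip(other, a)):
                # For exact duplicates, keep only the first occurrence.
                if tuple(other) != tuple(a) or j < i:
                    dominated = True
                    break
        if not dominated:
            kept.append(a)
    return kept


# Illustrative two-state example (made-up numbers, not from the paper).
example_alphas = [(1.0, 0.0), (0.0, 1.0), (0.4, 0.4), (0.6, 0.6)]
pruned = drop_pointwise_dominated(example_alphas)
```

Because every belief vector is nonnegative, a vector that is componentwise no larger than another can never attain the maximum strictly, so removing it leaves `value` unchanged at every belief.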
