Application of Markov decision processes to search problems

Many decision problems contain, in some form, an NP-hard combinatorial problem. Decision support systems therefore have to solve such combinatorial problems in a reasonable time. Many combinatorial problems can be solved by a search method. The search methods used in decision support systems have to be robust in the sense that they can handle a large variety of (user-defined) constraints and that they allow user interaction, i.e. they allow a decision maker to control the search process manually. In this paper we show how Markov decision processes can be used to guide a random search process. We first formulate search problems as a special class of Markov decision processes such that the search space of a search problem is the state space of the Markov decision process. In general it is not possible to compute an optimal control procedure for these Markov decision processes in a reasonable time. We therefore define several simplifications of the original problem that have much smaller state spaces. For these simplifications (decompositions and abstractions) we find optimal strategies, and we use the exact solutions of the simplified problems to guide a randomized search process. The search process selects states for further search at random, with probabilities based on the optimal strategies of the simplified problems. This randomization is a substitute for explicit backtracking and avoids problems with local extrema. The randomized search procedure is repeated for as long as time is available to solve the problem, and the best of the solutions generated during that time is accepted. We illustrate the approach with two examples: the N-puzzle and a job shop scheduling problem.

Kees M. van Hee | Leo B. Hartman
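
The idea in the abstract can be illustrated with a small sketch for the 8-puzzle. It is not the authors' procedure, only one plausible reading of it under stated assumptions. The simplified problem used here is a decomposition in which every tile moves independently of the others; its exact optimal cost is the sum of Manhattan distances. The exponential weighting, the parameter `beta`, the step limit, the fixed number of restarts (standing in for a time budget), and all function names are hypothetical choices made for illustration.

```python
import math
import random

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the blank square


def neighbors(state):
    """Return all states reachable by sliding one tile into the blank."""
    i = state.index(0)
    r, c = divmod(i, 3)
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            j = nr * 3 + nc
            s = list(state)
            s[i], s[j] = s[j], s[i]
            result.append(tuple(s))
    return result


def relaxed_cost(state):
    """Exact optimal cost of the simplified problem (assumption: tiles move
    independently), i.e. the sum of Manhattan distances to the goal."""
    total = 0
    for pos, tile in enumerate(state):
        if tile:
            goal_pos = tile - 1
            total += abs(pos // 3 - goal_pos // 3) + abs(pos % 3 - goal_pos % 3)
    return total


def random_walk(start, rng, beta=2.0, max_steps=200):
    """One randomized search run. Successors are drawn with probability
    proportional to exp(-beta * relaxed_cost); this weighting is an
    illustrative assumption, not taken from the paper."""
    path = [start]
    state = start
    for _ in range(max_steps):
        if state == GOAL:
            return path
        succ = neighbors(state)
        weights = [math.exp(-beta * relaxed_cost(s)) for s in succ]
        state = rng.choices(succ, weights=weights, k=1)[0]
        path.append(state)
    return path if state == GOAL else None


def repeated_search(start, restarts=200, seed=0):
    """Repeat the randomized run and keep the shortest solution found.
    A fixed restart count stands in for 'as long as we have time'."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        path = random_walk(start, rng)
        if path is not None and (best is None or len(path) < len(best)):
            best = path
    return best


if __name__ == "__main__":
    start = (1, 2, 3, 4, 5, 6, 0, 7, 8)  # two moves away from GOAL
    solution = repeated_search(start)
    print("moves in best solution:", len(solution) - 1)
```

Because every run restarts from the initial state and moves are drawn at random, no explicit backtracking is needed: a run that wanders into a poor region is simply abandoned at the step limit and the next run may take a different route.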
