Paper information - Finding the K best policies in a finite-horizon Markov decision process


Abstract: Directed hypergraphs are a general modelling and algorithmic tool that has been used successfully in many research areas, including artificial intelligence, database systems, fuzzy systems, propositional logic and transportation networks. However, modelling Markov decision processes using directed hypergraphs has not yet been considered. In this paper we consider finite-horizon Markov decision processes (MDPs) with finite state and action spaces and present an algorithm for finding the K best deterministic Markov policies. That is, we are interested in ranking the first K deterministic Markov policies in non-decreasing order using an additive criterion of optimality. The algorithm uses a directed hypergraph to model the finite-horizon MDP. It is shown that the problem of finding the optimal policy can be formulated as a minimum weight hyperpath problem and solved in linear time, with respect to the input data representing the MDP, under different additive optimality criteria.
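The abstract's central idea can be illustrated with a small sketch. It is not the authors' algorithm. It assumes that each action is stored as a hyperarc whose head is a (stage, state) node, whose tail is the set of possible successor states at the next stage, and whose weight is an expected-cost term. The toy data, the names `ACTIONS`, `solve`, `evaluate` and `k_best_brute_force`, and the brute-force ranking are all hypothetical. Backward induction over the stages plays the role of the minimum weight hyperpath computation and touches each hyperarc once. The ranking step simply enumerates every deterministic Markov policy, which is only feasible for tiny instances.

```python
from itertools import product

# Hypothetical toy MDP: decision stages 0..N-1, terminal stage N.
N = 2
STATES = (0, 1)
TERMINAL_COST = {0: 0.0, 1: 5.0}

# One hyperarc per action: head (stage, state) -> (cost, {next_state: prob}).
ACTIONS = {
    (n, s): {
        "keep": (1.0 + s, {s: 1.0}),
        "switch": (2.0, {0: 0.5, 1: 0.5}),
    }
    for n in range(N)
    for s in STATES
}


def arc_weight(cost, transitions, value, next_stage):
    """Additive criterion: immediate cost plus expected weight of the tail."""
    return cost + sum(p * value[(next_stage, t)] for t, p in transitions.items())


def solve(actions, terminal_cost, n_stages, states):
    """Backward induction; every hyperarc is examined exactly once."""
    value = {(n_stages, s): terminal_cost[s] for s in states}
    policy = {}
    for n in reversed(range(n_stages)):
        for s in states:
            best_action, best_weight = None, float("inf")
            for action, (cost, transitions) in actions[(n, s)].items():
                weight = arc_weight(cost, transitions, value, n + 1)
                if weight < best_weight:
                    best_action, best_weight = action, weight
            value[(n, s)] = best_weight
            policy[(n, s)] = best_action
    return value, policy


def evaluate(policy, actions, terminal_cost, n_stages, states):
    """Weight of a fixed deterministic Markov policy at every node."""
    value = {(n_stages, s): terminal_cost[s] for s in states}
    for n in reversed(range(n_stages)):
        for s in states:
            cost, transitions = actions[(n, s)][policy[(n, s)]]
            value[(n, s)] = arc_weight(cost, transitions, value, n + 1)
    return value


def k_best_brute_force(k, start, actions, terminal_cost, n_stages, states):
    """Rank all policies by weight at `start` (exponential; illustration only)."""
    nodes = sorted(actions)
    ranked = []
    for choice in product(*(sorted(actions[node]) for node in nodes)):
        policy = dict(zip(nodes, choice))
        weight = evaluate(policy, actions, terminal_cost, n_stages, states)[start]
        ranked.append((weight, policy))
    ranked.sort(key=lambda item: item[0])  # non-decreasing order
    return ranked[:k]


value, policy = solve(ACTIONS, TERMINAL_COST, N, STATES)
ranked = k_best_brute_force(3, (0, 0), ACTIONS, TERMINAL_COST, N, STATES)
for weight, pol in ranked:
    print(weight, pol)
```

In this toy instance several policies tie at the start node because they differ only in states that are never reached, so a ranking procedure needs a rule for handling ties. The sketch simply keeps them in enumeration order.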

Lars Relund Nielsen | Anders Ringgaard Kristensen
