Paper Information - Singularly Perturbed Markov Decision Processes: A Multiresolution Algorithm

Singularly Perturbed Markov Decision Processes: A Multiresolution Algorithm

Singular perturbation techniques allow the derivation of an aggregate model whose solution is asymptotically optimal for Markov decision processes with strong and weak interactions. We develop an algorithm that takes advantage of the asymptotic optimality of the aggregate model to compute the solution of the original model. We derive conditions under which the proposed algorithm has better worst-case complexity than conventional contraction algorithms. Based on our complexity analysis, we show that the major benefit of aggregation is that the reduced-order model is no longer ill-conditioned. The reduction in the number of states (due to aggregation) is a secondary benefit. This is a surprising result, since intuition would suggest that the reduced-order model can be solved more efficiently because it has fewer states. However, we show that this is not necessarily the case. Our theoretical analysis and numerical experiments show that the proposed algorithm can compute the optimal solution with a redu...

Panos Parpas | Chin Pang Ho
