Two-Time Scale Controlled Markov Chains: A Decomposition and Parallel Processing Approach

This correspondence deals with a class of ergodic control problems for systems described by Markov chains with strong and weak interactions. These systems are composed of a set of weakly coupled subchains. Using results already available in the literature, one formulates a limit control problem whose solution can be obtained via an associated nondifferentiable convex programming (NDCP) problem. The NDCP problem is solved with the Analytic Center Cutting Plane Method (ACCPM), which implements a dialogue between a master program, which computes the analytic center of a localization set containing the solution, and an oracle, which proposes cutting planes that reduce the size of the localization set at each main iteration. The interest of this implementation comes from two characteristics: (i) the oracle proposes cutting planes by solving reduced-size Markov decision problems (MDPs) via a linear program (LP) or a policy iteration method; (ii) several cutting planes can be proposed simultaneously through a parallel implementation on multiple processors. The correspondence concentrates on these two aspects and demonstrates the substantial computational speed-up obtained on a large-scale MDP arising from the numerical approximation "à la Kushner-Dupuis" of a singularly perturbed hybrid stochastic control problem.
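The master/oracle dialogue described above can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation. The localization set is an interval, whose analytic center (the maximizer of log(x - lo) + log(hi - x)) is simply its midpoint. The oracle is a hypothetical piecewise-linear convex function that stands in for the reduced-size MDP solves of the paper. The names `make_oracle` and `cutting_plane_1d` are assumptions introduced for illustration only.

```python
from typing import Callable, List, Tuple

Cut = Tuple[float, float]  # (function value, subgradient) returned by the oracle


def make_oracle(pieces: List[Tuple[float, float]]) -> Callable[[float], Cut]:
    """Hypothetical oracle for f(x) = max_i (a_i * x + b_i), convex and nondifferentiable."""
    def oracle(x: float) -> Cut:
        # The slope of any active piece is a valid subgradient at x.
        a, b = max(pieces, key=lambda p: p[0] * x + p[1])
        return a * x + b, a
    return oracle


def cutting_plane_1d(oracle: Callable[[float], Cut], lo: float, hi: float,
                     tol: float = 1e-8, max_iter: int = 200) -> Tuple[float, float]:
    """Minimize a convex function on [lo, hi] by querying the analytic center."""
    best_x, best_f = lo, float("inf")
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        x = 0.5 * (lo + hi)   # master step: analytic center of the interval
        f, g = oracle(x)      # oracle step: value and subgradient at the query point
        if f < best_f:
            best_x, best_f = x, f
        # Cut: f(y) >= f(x) + g * (y - x), so no minimizer lies on the side where g * (y - x) > 0.
        if g > 0:
            hi = x
        elif g < 0:
            lo = x
        else:
            break             # zero subgradient: x is a minimizer
    return best_x, best_f


if __name__ == "__main__":
    oracle = make_oracle([(-1.0, 0.0), (0.5, -3.0), (2.0, -9.0)])
    x_star, f_star = cutting_plane_1d(oracle, -10.0, 10.0)
    print(x_star, f_star)
```

For the three pieces used here, the minimum lies where -x and 0.5x - 3 intersect, at x = 2 with value -2, and the iterates approach that point. In higher dimensions the localization set is a polyhedron and computing its analytic center requires an interior-point step, which this sketch deliberately omits.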
