Measure-Valued Differentiation for Markov Chains

Abstract

This paper addresses the problem of sensitivity analysis for finite-horizon performance measures of general Markov chains. We derive closed-form expressions and associated unbiased gradient estimators for the derivatives of finite products of Markov kernels by measure-valued differentiation (MVD). In the MVD setting, the derivatives of Markov kernels, called $\mathcal{D}$-derivatives, are defined with respect to a class of performance functions $\mathcal{D}$ such that, for any performance measure $g \in \mathcal{D}$, the derivative of the integral of $g$ with respect to the one-step transition probability of the Markov chain exists. The MVD approach (i) yields results that can be applied to performance functions from a predefined class, (ii) allows for a product rule of differentiation, that is, analyzing the derivative of the transition kernel immediately yields finite-horizon results, (iii) provides an operator-language approach to the differentiation of Markov chains, and (iv) clearly identifies the trade-off between the generality of the performance classes that can be analyzed and the generality of the classes of measures (Markov kernels). The $\mathcal{D}$-derivative of a measure can be interpreted in terms of various (unbiased) gradient estimators, and the product rule for $\mathcal{D}$-differentiation yields a product rule for various gradient estimators.
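To make the construction concrete, the following worked example (ours, not drawn from the paper itself) shows a $\mathcal{D}$-derivative in the standard case of an exponential distribution; the specific triple below is a well-known instance of MVD and serves only as an illustration. Let $\mu_\theta$ denote the exponential distribution with rate $\theta > 0$ and density $f_\theta(x) = \theta e^{-\theta x}$ on $[0, \infty)$. Differentiating under the integral sign gives

$$\frac{d}{d\theta} \int g \, d\mu_\theta = \int_0^\infty g(x) \, e^{-\theta x} (1 - \theta x) \, dx = \frac{1}{\theta} \left( \int g \, d\mu_\theta^+ - \int g \, d\mu_\theta^- \right),$$

where $\mu_\theta^+ = \mu_\theta$ and $\mu_\theta^-$ is the Gamma$(2, \theta)$ distribution with density $\theta^2 x e^{-\theta x}$. The triple $(1/\theta, \mu_\theta^+, \mu_\theta^-)$ is then a $\mathcal{D}$-derivative of $\mu_\theta$, and drawing $X^+ \sim \mu_\theta^+$ and $X^- \sim \mu_\theta^-$ yields the unbiased estimator $\theta^{-1}\bigl(g(X^+) - g(X^-)\bigr)$. As a sanity check, for $g(x) = x$ the estimator has mean $\theta^{-1}(1/\theta - 2/\theta) = -1/\theta^2$, which equals $\frac{d}{d\theta}\,\theta^{-1}$. For a Markov kernel $P_\theta$, the product rule referred to in the abstract takes, in operator notation, the standard form

$$\frac{d}{d\theta} P_\theta^N = \sum_{k=0}^{N-1} P_\theta^k \left( \frac{d}{d\theta} P_\theta \right) P_\theta^{N-1-k},$$

so that substituting the one-step $\mathcal{D}$-derivative at each position $k$ immediately produces a finite-horizon gradient estimator built from pairs of "phantom" chains that differ in a single transition.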