Measure-valued differentiation for stochastic processes: the finite horizon case

This paper addresses the problem of sensitivity analysis for finite horizon performance measures of general Markov chains. We derive closed-form expressions and associated unbiased gradient estimators for derivatives of finite products of Markov kernels by means of measure-valued differentiation (MVD). In the MVD setting, derivatives of Markov kernels, called D-derivatives, are defined with respect to a suitably chosen class of performance functions D, such that for any performance function g ∈ D the derivative of the integral of g with respect to the one-step transition probability of the Markov chain exists. The MVD approach (1) yields results that apply to any performance function in the predefined class, (2) admits a product rule of differentiation, so that differentiating the transition kernel immediately yields finite horizon results, (3) provides an operator-language approach to the differentiation of Markov chains, and (4) makes explicit the trade-off between the generality of the class of performance functions and the generality of the class of measures (Markov kernels) that can be analyzed. The D-derivative of a measure can be interpreted in terms of various (unbiased) gradient estimators, and the product rule for D-differentiation yields a corresponding product rule for these estimators.
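As a concrete illustration of the kind of unbiased estimator the D-derivative yields (this example is a sketch and is not taken from the paper), consider the exponential distribution with rate θ. A well-known weak-derivative decomposition writes its derivative with respect to θ as a scaled difference of two measures, d/dθ Exp(θ) = (1/θ)(Exp(θ) − Erlang(2, θ)), so that d/dθ E[g(X)] can be estimated by sampling one "positive" and one "negative" phantom variable per replication. The function name `mvd_gradient_estimate` below is our own choice for illustration.

```python
import random

def mvd_gradient_estimate(theta, g, n_samples, seed=0):
    """Monte Carlo MVD estimator of d/dtheta E[g(X)] for X ~ Exp(theta).

    Uses the weak-derivative triple for the exponential distribution:
        d/dtheta Exp(theta) = (1/theta) * (Exp(theta) - Erlang(2, theta)),
    i.e. the derivative is the scaled difference of a "positive" and a
    "negative" phantom measure.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Sample from the positive part, Exp(theta).
        x_plus = rng.expovariate(theta)
        # Sample from the negative part, Erlang(2, theta) = sum of two Exp(theta).
        x_minus = rng.expovariate(theta) + rng.expovariate(theta)
        total += (g(x_plus) - g(x_minus)) / theta
    return total / n_samples

theta = 2.0
# For g(x) = x we have E[X] = 1/theta, so the exact derivative is
# -1/theta**2 = -0.25; the estimate should be close to that value.
est = mvd_gradient_estimate(theta, lambda x: x, n_samples=200_000)
print(est)
```

Because each replication draws from both measures of the decomposition, the estimator is unbiased for the derivative, which is the property that the paper's product rule then propagates through finite products of kernels.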