Optimal Switching Problem for Markov Chains

We consider the following multi-step version of the optimal stopping problem. There is a Markov chain {x_t} with a Borel state space X, and there are two functions f < g defined on X; one may interpret f(x_t) and g(x_t) as the selling price and the purchase price of an asset at epoch t. A controller selects a sequence of stopping times τ_1 ≤ τ_2 ≤ … and is, at any time, either in a position to sell or in a position to buy the asset. By selecting τ = τ_k, the controller, depending on the current position, either receives a reward f(x_τ) or pays a cost g(x_τ), and then switches to the opposite position. The control process terminates at an absorbing boundary, and the problem is to maximize the expected total reward minus the expected total cost.
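To make the setup concrete, the following is a minimal sketch of how such a problem might be solved numerically in a much simpler setting than the one above. Everything in it is an illustrative assumption rather than the method of the paper: the state space is finite, the payoff at the absorbing boundary is zero, and the two value functions (one per position) are approximated by iterating the natural dynamic programming equations V_sell = max(f + V_buy, P V_sell) and V_buy = max(V_sell - g, P V_buy). The function name solve_switching and the example chain are hypothetical.

```python
def solve_switching(P, f, g, absorbing, tol=1e-12, max_iter=100_000):
    """Value iteration for a finite-state switching problem (illustrative sketch).

    P         : transition matrix as a list of rows, P[x][y] = Prob(x -> y)
    f, g      : selling and purchase prices per state, with f[x] < g[x]
    absorbing : set of boundary states where the process terminates
    Assumption: the payoff at the absorbing boundary is zero.
    Returns (v_sell, v_buy), the approximate values in each position.
    """
    n = len(P)
    v_sell = [0.0] * n  # value when in a position to sell
    v_buy = [0.0] * n   # value when in a position to buy
    for _ in range(max_iter):
        new_sell = [0.0] * n
        new_buy = [0.0] * n
        for x in range(n):
            if x in absorbing:
                continue  # boundary values stay at zero
            # Expected value of waiting one more step in the same position.
            cont_sell = sum(P[x][y] * v_sell[y] for y in range(n))
            cont_buy = sum(P[x][y] * v_buy[y] for y in range(n))
            # Either switch now (collect f or pay g) or keep waiting.
            new_sell[x] = max(f[x] + v_buy[x], cont_sell)
            new_buy[x] = max(v_sell[x] - g[x], cont_buy)
        diff = max(
            abs(a - b)
            for a, b in zip(new_sell + new_buy, v_sell + v_buy)
        )
        v_sell, v_buy = new_sell, new_buy
        if diff < tol:
            break
    return v_sell, v_buy


# Hypothetical example: symmetric random walk on {0, ..., 4},
# absorbed at 0 and 4, with prices f(x) = x - 0.5 < g(x) = x + 0.5.
n = 5
P = [[0.0] * n for _ in range(n)]
P[0][0] = 1.0
P[n - 1][n - 1] = 1.0
for x in range(1, n - 1):
    P[x][x - 1] = 0.5
    P[x][x + 1] = 0.5
f = [x - 0.5 for x in range(n)]
g = [x + 0.5 for x in range(n)]

v_sell, v_buy = solve_switching(P, f, g, absorbing={0, n - 1})
```

In this example the walk is a martingale and each buy-then-sell round trip loses the spread g - f, so one would expect buying never to be worthwhile; the sketch is meant only to show the structure of the two coupled stopping problems, not to address general Borel state spaces.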