11-041 Optimistic planning for sparsely stochastic systems ∗

We describe an online planning algorithm for finite-action, sparsely stochastic Markov decision processes, in which each random state transition can lead to only a small number of possible next states. The algorithm builds a planning tree by iteratively expanding states, with the most promising states expanded first, in an optimistic procedure that aims to return a good action after a strictly limited number of expansions. The novel algorithm is called optimistic planning for sparsely stochastic systems.
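
To make the expansion loop concrete, the following Python sketch shows one possible best-first, optimistic tree search under a fixed expansion budget. It is an illustration only, not the algorithm as specified in the paper. Everything not stated in the abstract is an assumption: rewards lie in [0, 1], values are discounted by `gamma`, the leaf to expand is the one with the largest reach probability times discount among the leaves under the optimistic actions, and the returned action maximizes a lower bound at the root. The names `Node`, `bound`, `q_bound`, `optimistic_leaves`, `plan` and `chain_model` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    state: object
    depth: int
    prob: float  # probability of reaching this node from the root
    children: dict = field(default_factory=dict)  # action -> [(p, reward, Node)]


def q_bound(node, action, gamma, leaf_value):
    """Bound on the value of `action` in `node`; unexpanded leaves count as `leaf_value`."""
    return sum(p * (r + gamma * bound(child, gamma, leaf_value))
               for p, r, child in node.children[action])


def bound(node, gamma, leaf_value):
    """Upper bound if leaf_value = 1/(1-gamma), lower bound if leaf_value = 0."""
    if not node.children:
        return leaf_value
    return max(q_bound(node, a, gamma, leaf_value) for a in node.children)


def optimistic_leaves(node, gamma, v_max):
    """Yield the leaves reached by following the action with the best upper bound."""
    if not node.children:
        yield node
        return
    best = max(node.children, key=lambda a: q_bound(node, a, gamma, v_max))
    for _, _, child in node.children[best]:
        yield from optimistic_leaves(child, gamma, v_max)


def plan(root_state, model, actions, gamma=0.9, budget=30):
    """Return an action for `root_state` after `budget` >= 1 node expansions.

    Assumed interface: `model(state, action)` returns a non-empty list of
    (probability, reward, next_state) triples with rewards in [0, 1].
    """
    v_max = 1.0 / (1.0 - gamma)  # largest possible discounted return
    root = Node(root_state, 0, 1.0)
    for _ in range(budget):
        # Assumed selection rule: largest reach probability times discount.
        leaf = max(optimistic_leaves(root, gamma, v_max),
                   key=lambda n: n.prob * gamma ** n.depth)
        for a in actions:
            leaf.children[a] = [
                (p, r, Node(s, leaf.depth + 1, leaf.prob * p))
                for p, r, s in model(leaf.state, a)
            ]
    # Choose the root action with the best guaranteed (lower-bound) value.
    return max(actions, key=lambda a: q_bound(root, a, gamma, 0.0))


def chain_model(state, action):
    """Toy sparsely stochastic chain on states 0..4; entering state 4 pays reward 1."""
    reward = lambda s: 1.0 if s == 4 else 0.0
    if action == "right":
        nxt = min(state + 1, 4)
        # The move succeeds with probability 0.8, otherwise the state is unchanged.
        return [(0.8, reward(nxt), nxt), (0.2, reward(state), state)]
    nxt = max(state - 1, 0)
    return [(1.0, reward(nxt), nxt)]
```

Calling `plan(2, chain_model, ("left", "right"))` runs 30 expansions from state 2 of the toy chain. Each transition here has at most two outcomes, which is the sparsity that keeps the tree small enough to search under a tight budget. The bounds are recomputed recursively at every step for brevity; a practical implementation would cache them and update only the path above the expanded leaf.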