Multi-Step Truncated Q Learning Algorithm

Q-learning is a core algorithm in reinforcement learning. To overcome the drawbacks of one-step Q-learning and of the Q(λ) algorithm, this paper proposes the multi-step truncated Q-learning (MTQ) algorithm. MTQ uses information from the next k steps to update the current Q value, so it takes longer-term rewards into account while keeping the computational cost low, striking a balance between update speed and computational complexity. Experiments demonstrate the effectiveness of the algorithm.
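The abstract does not spell out the update rule, but the standard form of a k-step truncated update bootstraps from the greedy Q value k steps ahead: G_t = Σ_{i=0}^{k-1} γ^i r_{t+i} + γ^k max_a Q(s_{t+k}, a). The following is a minimal tabular sketch on a toy chain MDP; the environment, hyperparameters, and exact update are illustrative assumptions, not the paper's MTQ implementation.

```python
import numpy as np

# Illustrative sketch of a k-step truncated Q-learning update on a toy
# chain MDP (states 0..N-1; action 1 moves right, action 0 moves left;
# reward 1 on reaching the rightmost state). All names and parameters
# are assumptions for illustration, not the paper's exact MTQ method.

N_STATES, N_ACTIONS = 5, 2
GAMMA, ALPHA, EPSILON, K = 0.9, 0.5, 0.3, 3

def step(s, a):
    """Deterministic chain dynamics; the episode ends at the goal state."""
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, reward, s2 == N_STATES - 1

def train(episodes=500, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(episodes):
        s, done, buf = 0, False, []   # buf holds up to K recent transitions
        while not done:
            # Epsilon-greedy behavior policy.
            a = int(rng.integers(N_ACTIONS)) if rng.random() < EPSILON else int(Q[s].argmax())
            s2, r, done = step(s, a)
            buf.append((s, a, r))
            if len(buf) == K:
                # Truncated k-step return, bootstrapped from max_a Q(s_{t+k}, a)
                # unless the k-th step is terminal.
                G = sum(GAMMA**i * ri for i, (_, _, ri) in enumerate(buf))
                if not done:
                    G += GAMMA**K * Q[s2].max()
                s0, a0, _ = buf.pop(0)
                Q[s0, a0] += ALPHA * (G - Q[s0, a0])
            s = s2
        while buf:  # flush shorter-than-k tails at episode end (no bootstrap)
            G = sum(GAMMA**i * ri for i, (_, _, ri) in enumerate(buf))
            s0, a0, _ = buf.pop(0)
            Q[s0, a0] += ALPHA * (G - Q[s0, a0])
    return Q

Q = train()
print(Q[N_STATES - 2])  # Q values at the state next to the goal
```

Compared with one-step Q-learning, each update here propagates reward information k states back at once; compared with Q(λ), it needs only a length-k buffer rather than eligibility traces over the whole state-action space, which is the speed/complexity trade-off the abstract refers to.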