Randomized Linear Programming Solves the Markov Decision Problem in Nearly Linear (Sometimes Sublinear) Time
暂无分享,去创建一个
We propose a novel randomized linear programming algorithm for approximating the optimal policy of the discounted-reward and average-reward Markov decision problems. By leveraging the value–policy ...