On-line EM Algorithm and Reinforcement Learning
We previously proposed an on-line EM algorithm for the Normalized Gaussian Network (NGnet), a network of local linear regression units. In this article, we apply an approach based on this on-line EM algorithm to reinforcement learning problems. We examine two tasks: swinging up and stabilizing a single pendulum with limited torque, and stabilizing a double pendulum. In these tasks, our approach proves much more efficient than one based on the gradient descent algorithm.
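To make the phrase "network of local linear regression units" concrete, the following is a minimal sketch of an NGnet forward pass. It is not the authors' implementation: the function name `ngnet_output`, its parameter names, the scalar output, and the use of isotropic Gaussians (one width per unit instead of a full covariance matrix) are all simplifying assumptions made for illustration. Each unit produces a local linear prediction, and the predictions are blended using Gaussian activations normalized to sum to one.

```python
import math


def ngnet_output(x, centers, widths, weights, biases):
    """Forward pass of a scalar-output NGnet (illustrative sketch).

    Assumption: each unit uses an isotropic Gaussian with a single
    width, rather than a full covariance matrix.
    """
    dim = len(x)

    # Log of each unit's Gaussian activation. The (2*pi)^(-dim/2)
    # factor is shared by all units and cancels in the normalization.
    log_g = []
    for mu, sigma in zip(centers, widths):
        sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
        log_g.append(-0.5 * sq_dist / sigma ** 2 - dim * math.log(sigma))

    # Normalize the activations so they sum to one; subtracting the
    # maximum first avoids underflow when x is far from every center.
    peak = max(log_g)
    g = [math.exp(v - peak) for v in log_g]
    total = sum(g)
    gates = [v / total for v in g]

    # Local linear regression output of each unit: w_i . x + b_i
    local = [
        sum(wi * xi for wi, xi in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]

    # Network output: gate-weighted sum of the local predictions.
    y = sum(n * out for n, out in zip(gates, local))
    return y, gates
```

As a usage example, with two units centered at -1 and 1 and equal widths, an input at 0 is equidistant from both centers, so the two local predictions are averaged:

```python
y, gates = ngnet_output(
    x=[0.0],
    centers=[[-1.0], [1.0]],
    widths=[1.0, 1.0],
    weights=[[2.0], [0.0]],
    biases=[1.0, 3.0],
)
# gates is [0.5, 0.5] and y is 2.0
```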