Fast-PPO: Proximal Policy Optimization with Optimal Baseline Method