Discretizing Continuous Action Space for On-Policy Optimization