Optimistic Multi-Agent Policy Gradient for Cooperative Tasks