POMO: Policy Optimization with Multiple Optima for Reinforcement Learning