Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games