A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games