Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction