Momentum-Based Variance Reduction in Non-Convex SGD