Faster non-convex federated learning via global and local momentum