An Efficient Statistical-based Gradient Compression Technique for Distributed Training Systems