Accelerating Tensor Swapping in GPUs With Self-Tuning Compression