Data Movement Is All You Need: A Case Study on Optimizing Transformers