Attention Is Not Only a Weight: Analyzing Transformers with Vector Norms