Attention is Not Only a Weight: Analyzing Transformers with Vector Norms