Supervised Attention in Sequence-to-Sequence Models for Speech Recognition