Multiple sequence alignment based on deep reinforcement learning with self-attention and positional encoding