Labelling of speech given its text representation