Generating Human Motion from Textual Descriptions with Discrete Representations