Paper information - The effects of voice type and quality on the intelligibility of a text-to-speech system

In developing a text-to-speech system, it is important to consider not only how intelligible the final output may be, but also how acceptable the synthetic voice is to the users of such a system. We might even go so far as to consider offering a range of different types of voice, some male, some female, to allow the user the flexibility they may require. It is not clear, however, just how different voices and different voice qualities may affect the intelligibility of the text-to-speech system. Nor, indeed, is it clear whether generating different voice types might involve simple modification of the basic synthetic voice offered, or require the extraction of a whole new parameter set. This paper addresses three basic issues: first, can we generate a number of different voices by modifying a single voice; second, how acceptable do naive listeners judge those voices to be; and finally, what, if any, is the effect of changing the voice type on the intelligibility of the synthetic output.

J. Brian Pickering

[1] David B. Pisoni, et al. Text-to-speech: the MITalk system, 1987.