Speech Synthesis from Text

This paper describes a successful approach to converting unrestricted English text to speech. Before taking up the details of this process, however, it is useful to place the task in context. Over the years, the need for computer-generated speech has grown. In part, this is due to the intrinsic nature of text, speech, and computing. Speech is the fundamental representation of language, present in all cultures (whether literate or not), so if there is to be any means of communication between the computer and its human users, speech provides the most broadly useful modality, except for the needs of the deaf. While text (considered as a string of conventional symbols) is often regarded as more durable than speech and more reliably preserved, this is in many ways a consequence of the relatively early progress of printing technology compared with the technology available for storing and manipulating speech. Furthermore, text-based interaction with computers requires typing (and often reading) skills that many potential users do not possess.

[1] J. Allen et al., "Synthesis of speech from unrestricted text," Proceedings of the IEEE, 1976.