Constraints on the perception of synthetic speech generated by rule

Within the next few years, voice response devices of various types will proliferate in human-machine communication systems. At present, however, relatively little basic or applied research has been carried out on the intelligibility, comprehension, and perceptual processing of the synthetic speech these devices produce. On the basis of our research, we identify five factors that must be considered in studying the perception of synthetic speech: (1) the specific demands imposed by a particular task, (2) the inherent limitations of the human information processing system, (3) the experience and training of the human listener, (4) the linguistic structure of the message set, and (5) the structure and quality of the speech signal.
