Comprehension of Synthetic Speech Produced by Rule: Word Monitoring and Sentence-by-Sentence Listening Times

Previous comprehension studies using postperceptual memory tests have often reported negligible differences in performance between natural speech and several kinds of synthetic speech produced by rule, despite large differences in segmental intelligibility. The present experiments investigated the comprehension of natural and synthetic speech using two different on-line tasks: word monitoring and sentence-by-sentence listening. On-line task performance was slower and less accurate for passages of synthetic speech than for passages of natural speech. Recognition memory performance in both experiments was less accurate following passages of synthetic speech than of natural speech. Monitoring performance, sentence listening times, and recognition memory accuracy all showed moderate correlations with intelligibility scores obtained using the Modified Rhyme Test. The results suggest that poorer comprehension of passages of synthetic speech is attributable in part to the greater encoding demands of synthetic speech. In contrast to earlier studies, the present results demonstrate that on-line tasks can be used to measure differences in comprehension performance between natural and synthetic speech.
