Prosodic Phrasing and Comprehension

From previous research we know that prosodic features are perceptually effective in marking boundaries and that a suitable implementation of these features improves the acceptability of synthetic speech. It can further be assumed that listeners use the perceived prosodic information to compute the meaning of the input speech. This paper therefore investigates whether a well-phrased utterance (that is, an utterance with prosodic boundaries in appropriate positions and with appropriate realizations) is easier to comprehend than a poorly phrased one. To measure this, we designed a method that combines a verification task with a question-answering task (“monitoring for the answer”). The stimulus set consisted of structurally ambiguous sentences. We expected listeners to react more rapidly when a question is followed by an appropriately phrased utterance than when it is followed by an utterance with neutral phrasing, and more rapidly with neutral phrasing than with inappropriate phrasing. The results confirmed these expectations: an appropriately phrased utterance always produced the fastest reaction times (RTs).
