Duration and intensity as perceptual cues for naïve listeners' prominence and boundary perception

I investigate the acoustic correlates of prosodic prominence and phrase boundaries, as perceived by naïve listeners, in spontaneous American English speech from the Buckeye corpus. Prosodic prominence and phrasing serve different functions in speech communication: prosodic phrase boundaries demarcate speech chunks that typically cohere semantically, while prominences encode focus and possibly also rhythmic structure. The acoustic correlates of prominence and phrase boundary are examined through measures of vowel duration and overall intensity of stressed vowels, to see how those measures correlate, individually or in combination, with naïve listeners’ perception of prominence and boundary. The results show that most stressed vowels are lengthened in preboundary words (i.e., words that are final in the prosodic phrase). Prosodic prominence is also cued by increased duration, but for some vowels in combination with higher overall intensity. These acoustic differences between perceived prominence and perceived boundary suggest that different mechanisms underlie their production. This claim is supported by the different functions that prominence and boundary serve in encoding information structure and in speech production planning.
