Pitch accent versus lexical stress: quantifying acoustic measures related to the voice source

In this paper, we explore acoustic correlates of pitch accent and main lexical stress in American English, and the interaction of these cues with other factors that affect prosody. In a controlled study, we varied the presence or absence and the type of pitch accent (L* vs. H*), the boundary-related tone sequence (L-L% vs. H-H%), and the gender of the talker, for the sentence “Dagada gave Bobby doodads”. The measures were duration, F0 (fundamental frequency), H1*−H2* (related to open quotient), and H1*−A3* (related to spectral tilt). Contour approximations were used to analyze the time-course movements of these measures. For “Dagada” we found that, consistent with earlier literature, a) H* and L* pitch accents showed different F0 contours, b) pitch-accented syllables were longer than unaccented ones, and c) stressed “ga” syllables had lower H1*−H2* values than surrounding unstressed syllables and, for male talkers, lower H1*−A3* values, indicating lesser spectral tilt. Unexpectedly, F0 maxima associated with an H* accent usually occurred later in the accented syllable than F0 minima associated with L*. The cues to lexical stress were consistent with or without pitch accent (e.g. lower H1*−H2*), but they sometimes interacted with gender and/or boundary tones: for example, lower H1*−A3* in stressed “ga” syllables was found for female talkers only in unaccented cases, and some cues to both accent and stress were less pronounced in the final word “doodads”, which also carried boundary-related tones.

Index Terms: voice source, prosody, voice quality
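To make the harmonic-difference measures concrete, the following is a minimal sketch of an uncorrected H1−H2 computation on a single analysis frame. It is an illustration only, not the procedure used in this study: the starred measures reported here (H1*−H2*) additionally correct the harmonic magnitudes for the influence of the formants, a step this sketch omits. The function name `h1_minus_h2`, the Hann window, the 8x zero-padding, the 10% search band around each harmonic, and the synthetic two-harmonic test frame are all assumptions made for the example, and F0 is assumed to be known in advance.

```python
import numpy as np


def h1_minus_h2(frame, fs, f0, search_frac=0.1):
    """Uncorrected H1-H2 in dB for one frame (hypothetical helper).

    frame: 1-D array of samples, fs: sampling rate in Hz,
    f0: fundamental frequency in Hz, assumed known for this frame.
    No formant correction is applied.
    """
    windowed = frame * np.hanning(len(frame))
    nfft = 8 * len(frame)  # zero-padding for a finer frequency grid
    spectrum = np.abs(np.fft.rfft(windowed, nfft))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)

    def peak_db(target_hz):
        # Largest spectral magnitude within +/- search_frac of the target.
        lo = target_hz * (1.0 - search_frac)
        hi = target_hz * (1.0 + search_frac)
        band = (freqs >= lo) & (freqs <= hi)
        return 20.0 * np.log10(spectrum[band].max())

    return peak_db(f0) - peak_db(2.0 * f0)


# Synthetic 40 ms frame: second harmonic at half the amplitude of the first,
# so H1-H2 should come out close to 20*log10(2) dB by construction.
fs = 16000
f0 = 200.0
t = np.arange(int(0.040 * fs)) / fs
frame = 1.0 * np.sin(2 * np.pi * f0 * t) + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)
h1h2 = h1_minus_h2(frame, fs, f0)
print(f"H1-H2 = {h1h2:.2f} dB")
```

A lower value from such a measure means the second harmonic is relatively stronger compared with the first, which is the direction reported above for stressed “ga” syllables. On real speech, the frame would be taken from a voiced region and F0 supplied by a pitch tracker.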
