Finding intonational boundaries using acoustic cues related to the voice source.

Acoustic cues related to the voice source, including harmonic structure and spectral tilt, were examined for their relevance to prosodic boundary detection. The measurements fall into five categories: duration, pitch, harmonic structure, spectral tilt, and amplitude. Distributions of the measurements and statistical analysis show that the measurements may be used to differentiate between prosodic categories. Detection experiments on the Boston University Radio Speech Corpus show detection rates of around 70% at the equal-error operating point for both accent detection and boundary detection, using only the acoustic measurements described, without any lexical or syntactic information. Further investigation of the detection results shows that duration and amplitude measurements and, to a lesser degree, pitch measurements are useful for detecting accents, while all voice source measurements except pitch measurements are useful for boundary detection.
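
The abstract does not say how the equal error operating point was computed. As a minimal sketch of one common reading, the Python below sweeps a decision threshold over detector scores and picks the point where the miss rate and the false alarm rate are closest to equal. The function name `equal_error_point`, the score and label lists, and the "score at or above threshold means detected" convention are all assumptions made for illustration, not details taken from the paper.

```python
def equal_error_point(scores, labels):
    """Return (threshold, detection_rate, false_alarm_rate) at the
    threshold where miss rate and false alarm rate are closest.

    scores: detector outputs, higher meaning "boundary/accent present"
            (hypothetical convention).
    labels: 1 for a true boundary/accent, 0 otherwise.
    """
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    best = None
    for t in sorted(set(scores)):
        # A token is detected when its score is at or above the threshold.
        miss = sum(s < t for s in pos) / len(pos)
        false_alarm = sum(s >= t for s in neg) / len(neg)
        gap = abs(miss - false_alarm)
        if best is None or gap < best[0]:
            best = (gap, t, miss, false_alarm)

    _, threshold, miss, false_alarm = best
    return threshold, 1.0 - miss, false_alarm


# Toy scores, invented purely to exercise the function.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 1, 0, 1, 0, 1, 1]
threshold, detection_rate, false_alarm_rate = equal_error_point(toy_scores, toy_labels)
print(threshold, detection_rate, false_alarm_rate)
```

Under this reading, a detection rate of around 70% would correspond to miss and false alarm rates of around 30% each; the toy data above are not drawn from the corpus.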
