Paper information - Modeling intensity contours and the interaction of pitch and intensity to improve automatic prosodic event detection and classification

Prosody, or the way words are spoken, carries information that is important for understanding a speaker's communicative intention. Many studies on automatic prosodic analysis focus on parameterizing pitch content. In this work, we extend previous pitch contour modeling features to intensity contours and develop a set of features based on the interaction of pitch and intensity. These new features improve on the state of the art in all prosodic event detection and classification tasks related to automatic ToBI labeling.

Andrew Rosenberg
