Paper Information - Unsupervised Prosodic Break Detection in Mandarin Speech

Unsupervised Prosodic Break Detection in Mandarin Speech

We propose that an automatic prosodic break detector for Mandarin speech can be trained without any prosodically labeled training data. We use only lexical and acoustic cues to create a small labeled training set, then apply semi-supervised learning to train the detector. The learning algorithm is a generative mixture model that can learn from both labeled and unlabeled data. Experiments on both English and Mandarin corpora validate the algorithm.
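The abstract does not spell out the mixture model, so the following is only a minimal sketch of how a generative mixture model can be fit with EM on labeled and unlabeled data together. It assumes one Gaussian per class over a single hypothetical feature (pause duration in seconds at a word boundary); the feature, the toy data, and the function names are illustrative assumptions, not the authors' model or results.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def semi_supervised_em(labeled, unlabeled, n_iter=50, min_var=1e-3):
    """Fit a two-class Gaussian mixture with EM.

    labeled:   list of (x, y) pairs, y in {0, 1} (0 = no break, 1 = break)
    unlabeled: list of x values with unknown class
    Returns {class: (prior, mean, variance)}.
    """
    classes = (0, 1)

    # Initialise each class from the small labeled seed set.
    params = {}
    for c in classes:
        xs = [x for x, y in labeled if y == c]
        mean = sum(xs) / len(xs)
        var = max(sum((x - mean) ** 2 for x in xs) / len(xs), min_var)
        params[c] = (len(xs) / len(labeled), mean, var)

    n_total = len(labeled) + len(unlabeled)
    for _ in range(n_iter):
        # E-step: class posteriors for unlabeled points only.
        resp = []
        for x in unlabeled:
            joint = [params[c][0] * gaussian_pdf(x, params[c][1], params[c][2])
                     for c in classes]
            total = sum(joint)
            if total == 0.0:  # numerical underflow: fall back to uniform
                resp.append([1.0 / len(classes)] * len(classes))
            else:
                resp.append([j / total for j in joint])

        # M-step: labeled points count with weight 1 for their known class,
        # unlabeled points count with their posterior weight.
        new_params = {}
        for c in classes:
            weighted = [(1.0 if y == c else 0.0, x) for x, y in labeled]
            weighted += [(r[c], x) for r, x in zip(resp, unlabeled)]
            mass = sum(w for w, _ in weighted)
            mean = sum(w * x for w, x in weighted) / mass
            var = max(sum(w * (x - mean) ** 2 for w, x in weighted) / mass, min_var)
            new_params[c] = (mass / n_total, mean, var)
        params = new_params
    return params


def predict(params, x):
    """Return the class with the highest joint probability for x."""
    return max(params, key=lambda c: params[c][0] * gaussian_pdf(x, params[c][1], params[c][2]))


# Toy data (made up for illustration): pause durations in seconds.
seed = [(0.02, 0), (0.03, 0), (0.05, 0), (0.30, 1), (0.35, 1), (0.40, 1)]
pool = [0.01, 0.04, 0.06, 0.03, 0.28, 0.33, 0.45, 0.38]
params = semi_supervised_em(seed, pool)
```

In this sketch the labeled seed set plays the role of the automatically created training set described in the abstract: it fixes which mixture component corresponds to which class, while the unlabeled pool refines the component parameters.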

Jui Ting Huang | Mark Hasegawa-Johnson | Chilin Shih

[1] Chilin Shih, et al. Duration Study for the Bell Laboratories Mandarin Text-to-Speech System, 1997.

[2] Gina-Anne Levow, et al. Unsupervised and Semi-supervised Learning of Tone and Pitch Accent, 2006, NAACL.

[3] Mark Hasegawa-Johnson, et al. An automatic prosody labeling system using ANN-based syntactic-prosodic model and GMM-based acoustic-prosodic model, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[4] Mari Ostendorf, et al. Prediction of abstract prosodic labels for speech synthesis, 1996, Comput. Speech Lang.

[5] Shrikanth S. Narayanan, et al. Combining acoustic, lexical, and syntactic evidence for automatic unsupervised prosody labeling, 2006, INTERSPEECH.

[6] Mari Ostendorf, et al. ToBI: a standard for labeling English prosody, 1992, ICSLP.

[7] Min Chu, et al. Locating Boundaries for Prosodic Constituents in Unrestricted Mandarin Texts, 2001, Int. J. Comput. Linguistics Chin. Lang. Process.

[8] Mari Ostendorf, et al. The use of prosody in syntactic disambiguation, 1991.

[9] Yao Qian, et al. Prosodic Word: the Lowest Constituent in the Mandarin Prosody Processing, 2002.

[10] Fuhui Long, et al. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy, 2003, IEEE Transactions on Pattern Analysis and Machine Intelligence.