Active Learning for Prediction of Prosodic Word Boundaries in Chinese TTS Using Maximum Entropy Markov Model

Hierarchical prosody structure generation is a key component of a Chinese speech synthesis system. The prosodic word, the basic prosodic unit, plays an important role in the naturalness and intelligibility of a Chinese text-to-speech (TTS) system. However, obtaining human annotations of prosodic words to train a supervised system is laborious and costly. To overcome this, we explore active learning techniques with the goal of reducing the amount of human-annotated data needed to attain a given level of performance. In this paper, an Active Maximum Entropy Markov Model (AMEMM) is used to predict Chinese prosodic word boundaries in unrestricted Chinese text. Experiments show that, in most of the cases considered, active selection strategies for labeling prosodic word boundaries perform as well as or better than random data selection.
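
The contrast between active and random selection can be illustrated with a minimal sketch. The abstract does not state which selection criterion is used, so the entropy-based uncertainty score below is an assumption, and the function names, the pool layout, and the probabilities are all hypothetical; a trained MEMM would supply the per-juncture boundary probabilities in practice.

```python
import math
import random


def boundary_entropy(p):
    """Binary entropy (in bits) of a boundary probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def sentence_uncertainty(boundary_probs):
    """Mean per-juncture entropy, so long sentences are not favoured by length alone."""
    if not boundary_probs:
        return 0.0
    return sum(boundary_entropy(p) for p in boundary_probs) / len(boundary_probs)


def select_active(pool, batch_size):
    """Pick the batch_size sentence ids the model is least certain about."""
    ranked = sorted(pool, key=lambda sid: sentence_uncertainty(pool[sid]), reverse=True)
    return ranked[:batch_size]


def select_random(pool, batch_size, seed=0):
    """Baseline: pick sentence ids uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(sorted(pool), batch_size)


# Hypothetical unlabeled pool: sentence id -> P(prosodic word boundary) at each juncture.
# These numbers are made up for illustration; they are not from the paper.
pool = {
    "s1": [0.98, 0.02, 0.95],  # model is confident everywhere
    "s2": [0.55, 0.48, 0.60],  # model is unsure everywhere
    "s3": [0.90, 0.50, 0.10],  # model is unsure at one juncture
}

chosen = select_active(pool, batch_size=2)
baseline = select_random(pool, batch_size=2)
```

In a full loop, the chosen sentences would be sent to annotators, added to the training set, and the model retrained before the next selection round.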
