Paper Information - Morphological Analysis Based Part-of-Speech Tagging for Uyghur Speech Synthesis


The accuracy of part-of-speech (POS) tagging is critical to downstream sub-tasks in the front-end text analysis model of a text-to-speech system. Uyghur is an agglutinative language in which a large number of words are formed by attaching suffixes to a stem (or root). Because Uyghur allows an effectively unlimited number of newly formed and derived words, conventional Uyghur POS tagging methods, which train on and predict the POS of whole words directly, need large tag sets and frequently run into out-of-vocabulary words. To address this problem, this paper proposes training the POS model on stems and predicting the POS of a word mainly from its stem. A bigram language model is used to segment each word at the stem-affix boundary, and a hidden Markov model is used to train and predict the POS of the stem. Finally, a rule-based adjustment step corrects the POS of a word when attaching a suffix to the stem changes it. Experimental results show that the proposed method clearly reduces the POS tagging error rate compared with the conventional POS tagging method.
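The three stages named in the abstract (bigram segmentation, HMM tagging of stems, rule adjustment) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the transliterated stems and suffixes, the two-tag set, every probability, and the single adjustment rule are made-up toy values, and the function names `segment`, `viterbi`, and `tag_sentence` are hypothetical. It also assumes at most one suffix per word.

```python
import math

# Toy values only: none of these morphemes, tags, or probabilities come from the paper.
FLOOR = 1e-9  # probability assigned to unseen events
TAGS = ["N", "V"]

# P(morpheme | previous morpheme); "<w>" and "</w>" mark word start and end.
BIGRAM = {
    ("<w>", "kitab"): 0.5, ("<w>", "oqu"): 0.5,
    ("kitab", "lar"): 0.6, ("kitab", "</w>"): 0.4,
    ("oqu", "ghuchi"): 0.5, ("oqu", "</w>"): 0.5,
    ("lar", "</w>"): 1.0, ("ghuchi", "</w>"): 1.0,
}
START = {"N": 0.6, "V": 0.4}                      # P(first tag)
TRANS = {("N", "N"): 0.3, ("N", "V"): 0.7,        # P(tag | previous tag)
         ("V", "N"): 0.6, ("V", "V"): 0.4}
EMIT = {("N", "kitab"): 0.9, ("N", "oqu"): 0.1,   # P(stem | tag)
        ("V", "oqu"): 0.9, ("V", "kitab"): 0.1}
# (stem tag, suffix) -> word tag, for suffixes assumed to change the POS.
RULES = {("V", "ghuchi"): "N"}


def logp(table, key):
    """Log-probability lookup with a floor for unseen keys."""
    return math.log(table.get(key, FLOOR))


def segment(word):
    """Return the (stem, suffix) split with the highest bigram score."""
    best, best_score = (word, ""), -math.inf
    for i in range(1, len(word) + 1):
        stem, suffix = word[:i], word[i:]
        path = ["<w>", stem] + ([suffix] if suffix else []) + ["</w>"]
        score = sum(logp(BIGRAM, pair) for pair in zip(path, path[1:]))
        if score > best_score:
            best, best_score = (stem, suffix), score
    return best


def viterbi(stems):
    """Most likely tag sequence for a non-empty list of stems."""
    scores = {t: logp(START, t) + logp(EMIT, (t, stems[0])) for t in TAGS}
    back = []
    for stem in stems[1:]:
        prev, scores, pointers = scores, {}, {}
        for t in TAGS:
            best_prev = max(TAGS, key=lambda p: prev[p] + logp(TRANS, (p, t)))
            scores[t] = (prev[best_prev] + logp(TRANS, (best_prev, t))
                         + logp(EMIT, (t, stem)))
            pointers[t] = best_prev
        back.append(pointers)
    tag = max(TAGS, key=scores.get)
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]


def tag_sentence(words):
    """Segment each word, tag the stems, then apply the suffix rules."""
    pairs = [segment(w) for w in words]
    stem_tags = viterbi([stem for stem, _ in pairs])
    return [RULES.get((tag, suffix), tag)
            for tag, (_, suffix) in zip(stem_tags, pairs)]


print(tag_sentence(["oqughuchi", "kitablar"]))
```

In this toy setup the HMM only needs emission entries for stems, so a new inflected form of a known stem is not out-of-vocabulary, which is the motivation the abstract gives for tagging stems instead of whole words.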

Askar Hamdulla | Askar Rozi | Gulnar Ali | Guljamal Mamateli
