Thai grapheme-to-phoneme using probabilistic GLR parser
Several properties of the Thai language, such as the absence of word boundaries, linking syllables in pronunciation, and homographs, make it challenging to develop a Thai Grapheme-to-Phoneme (G2P) converter. A few Thai G2P systems have been proposed so far, using rule-based and decision-tree approaches. The rule-based approach is limited in how much context it can exploit. The decision-tree approach can capture local context to some extent when making a decision. In contrast, the Probabilistic Generalized LR (PGLR) approach is reported to capture both global and local context efficiently in its probabilistic model. In this paper, we implement a Thai G2P system based on the PGLR approach. Experimental results show 90.44% word accuracy when vowel length is ignored and 72.87% word accuracy under exact-match evaluation. These results are superior to those of the rule-based and decision-tree approaches.
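The two evaluation settings reported above differ only in how a predicted pronunciation is compared with the reference. The following is a minimal sketch of that distinction, not the evaluation code used in the paper. It assumes a hypothetical romanized phoneme notation in which a long vowel is written as a doubled vowel letter (e.g. `maa` vs. `ma`); the function names `normalize_length` and `word_accuracy` and the sample data are illustrative only.

```python
VOWELS = set("aeiou")  # assumption: vowels in a hypothetical romanized notation


def normalize_length(pron: str) -> str:
    """Collapse a doubled vowel letter (assumed long-vowel marking) to a single one."""
    out = []
    for ch in pron:
        if out and ch == out[-1] and ch in VOWELS:
            continue  # skip the second letter of a long vowel
        out.append(ch)
    return "".join(out)


def word_accuracy(predicted, reference, ignore_vowel_length=False):
    """Fraction of words whose predicted pronunciation matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    correct = 0
    for p, r in zip(predicted, reference):
        if ignore_vowel_length:
            p, r = normalize_length(p), normalize_length(r)
        correct += (p == r)
    return correct / len(reference)


# Made-up sample data, for illustration only.
reference = ["maa", "kin", "naam", "paj"]
predicted = ["ma", "kin", "nam", "pai"]

exact = word_accuracy(predicted, reference)
relaxed = word_accuracy(predicted, reference, ignore_vowel_length=True)
```

On this toy data, two of the errors are vowel-length errors only, so the relaxed score is higher than the exact-match score, mirroring the direction of the gap between the two figures in the abstract.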