Vietnamese-Thai machine translation using rule-based

This paper presents a rule-based Vietnamese-Thai machine translation (VTMT) system. The input Vietnamese text is first split into sentences at punctuation marks. Each sentence is then segmented into a sequence of words using a longest syllable matching algorithm together with named entity recognition (NER) rules. Each segmented word is looked up in a Vietnamese-Thai lexicon to find its corresponding Thai word. Unknown words and recognized named entities are transcribed using Vietnamese-to-Thai transcription rules. The system then analyzes the structure of the source sentence in order to generate the output Thai sentence. The proposed system achieved a translation accuracy of 77.15%, which is higher than that obtained with the popular website Google Translate.
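
The first stages of the pipeline described above (sentence splitting, longest syllable matching, and lexicon lookup) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the greedy left-to-right matching strategy, and the tiny `LEXICON` with its Thai glosses are all assumptions made for illustration, and the NER, transcription, and sentence-generation steps are omitted.

```python
import re
import unicodedata

# Hypothetical toy lexicon (not the paper's): a Vietnamese word, written as
# space-separated syllables, mapped to an illustrative Thai gloss.
_RAW_LEXICON = {
    "tôi": "ฉัน",
    "là": "เป็น",
    "sinh": "เกิด",
    "sinh viên": "นักศึกษา",
}
# Normalize keys to NFC so composed and decomposed diacritics compare equal.
LEXICON = {unicodedata.normalize("NFC", k): v for k, v in _RAW_LEXICON.items()}


def split_sentences(text):
    """Split text into sentences after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def segment(sentence, lexicon=LEXICON):
    """Greedy longest syllable matching: at each position, take the longest
    run of syllables that forms a lexicon entry; otherwise emit one syllable."""
    text = unicodedata.normalize("NFC", sentence).lower()
    syllables = [s.strip(".,;:!?\"'()") for s in text.split()]
    syllables = [s for s in syllables if s]
    max_len = max(len(key.split()) for key in lexicon)
    words, i = [], 0
    while i < len(syllables):
        # Try the longest candidate first, shrinking down to a single syllable.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = " ".join(syllables[i:i + n])
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                i += n
                break
    return words


def lookup(words, lexicon=LEXICON):
    """Pair each word with its lexicon entry; None marks an unknown word,
    which the described system would pass on to its transcription rules."""
    return [(w, lexicon.get(w)) for w in words]


if __name__ == "__main__":
    for sent in split_sentences("Tôi là sinh viên. Tôi là Nam."):
        print(lookup(segment(sent)))
```

In this sketch the two-syllable entry "sinh viên" is preferred over the one-syllable entry "sinh" because longer candidates are tried first, and a word absent from the lexicon (here "nam") is returned with `None` so that a later step could transcribe it.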
