Transfer-based statistical translation of Taiwanese sign language using PCFG

This article presents a transfer-based statistical model for translating Chinese into Taiwanese Sign Language (TSL). Two sets of probabilistic context-free grammars (PCFGs) are derived from a Chinese Treebank and a bilingual parallel corpus, and a three-stage translation model is built on them. First, the input Chinese sentence is parsed into candidate phrase structure trees (PSTs) using the Chinese PCFGs. Second, the Chinese PSTs are transferred into TSL PSTs according to the transfer probabilities between the Chinese and TSL context-free grammar (CFG) rules, which are derived from the bilingual parallel corpus. Finally, the TSL PSTs are used to generate candidate translations. The Viterbi algorithm is used to find the best translation across the three stages. The evaluation used three objective metrics (AER, Top-N, and BLEU) and one subjective metric (MOS). Experimental results show that the proposed approach outperforms IBM Model 3 on Chinese-to-TSL translation.
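The transfer and generation stages can be illustrated with a minimal sketch. It is not the authors' implementation: the rule table, its probabilities, the tree encoding, and the function name `best_transfer` are all assumptions made for illustration, and the parsing stage is skipped by starting from a single Chinese PST that has already been built. English glosses stand in for Chinese words and TSL signs.

```python
from math import prod

# Hypothetical transfer table (illustrative values, not from the paper).
# Key: a Chinese CFG rule as (parent, child tags).
# Value: candidate TSL child orders (permutations of child indices) with probabilities.
TRANSFER = {
    ("S", ("NP", "VP")): [((0, 1), 0.9), ((1, 0), 0.1)],
    ("VP", ("V", "NP")): [((1, 0), 0.7), ((0, 1), 0.3)],
}


def best_transfer(tree):
    """Return (probability, word list) for the most probable transfer of `tree`.

    A tree is (tag, word) for a leaf or (tag, [subtrees]) for an internal node.
    """
    tag, body = tree
    if isinstance(body, str):
        # Leaf: the word is passed through unchanged with probability 1.
        return 1.0, [body]
    # Solve each subtree first, then combine (bottom-up dynamic programming).
    children = [best_transfer(child) for child in body]
    rule = (tag, tuple(child[0] for child in body))
    # Rules missing from the table keep their original child order.
    options = TRANSFER.get(rule, [(tuple(range(len(body))), 1.0)])
    order, p_rule = max(options, key=lambda option: option[1])
    score = p_rule * prod(child_score for child_score, _ in children)
    words = [w for i in order for w in children[i][1]]
    return score, words


# Toy Chinese PST for a subject-verb-object sentence.
tree = ("S", [("NP", "I"), ("VP", [("V", "like"), ("NP", "apples")])])
score, words = best_transfer(tree)
print(words, round(score, 2))
```

Because each rule's transfer options are independent of the others in this toy setup, the best tree is found by taking the most probable option at every node. The model described above additionally scores competing Chinese parses, so its Viterbi search covers parsing, transfer, and generation together.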
