Improving Text-to-Pictograph Translation Through Word Sense Disambiguation

We describe the implementation of a Word Sense Disambiguation (WSD) tool in a Dutch Text-to-Pictograph translation system, which converts textual messages into sequences of pictographic images. The system is used in an online platform for Augmentative and Alternative Communication (AAC). In the original translation process, word senses were not disambiguated before words were converted into pictographs, which often resulted in incorrect translations. Integrating a WSD tool gives the system a better semantic understanding of the input messages.
