Translating Unknown Words by Analogical Learning

Unknown words are a well-known hindrance to natural language applications. In particular, they drastically degrade machine translation quality. As an easy way out, commercial translation systems usually let their users add unknown words and their translations to a dedicated lexicon. Recently, Stroppa and Yvon (2005) have shown how analogical learning alone deals nicely with morphology in different languages. In this study, we show that analogical learning also offers an elegant and effective solution to the problem of identifying potential translations of unknown words.

[1] Philippe Langlais, et al. Mood at work: Ramses versus Pharaoh, 2006, WMT@HLT-NAACL.

[2] Yves Lepage, et al. De l'analogie rendant compte de la commutation en linguistique, 2003.

[3] Philipp Koehn, et al. Learning a Translation Lexicon from Monolingual Corpora, 2002, ACL 2002.

[4] François Yvon, et al. An Analogical Learner for Morphological Analysis, 2005, CoNLL.

[5] Yves Lepage, et al. ALEPH: an EBMT system based on the preservation of proportional analogies between sentences across languages, 2005, IWSLT.

[6] Philipp Koehn, et al. Manual and Automatic Evaluation of Machine Translation between European Languages, 2006, WMT@HLT-NAACL.

[7] Yaser Al-Onaizan, et al. Translating Named Entities Using Monolingual and Bilingual Resources, 2002, ACL.

[8] Reinhard Rapp, et al. Automatic Identification of Word Translations from Unrelated English and German Corpora, 1999, ACL.

[9] Yves Lepage, et al. Solving Analogies on Words: An Algorithm, 1998, COLING-ACL.

[10] Chiori Hori, et al. Overview of the IWSLT 2005 Evaluation Campaign, 2005, IWSLT.

[11] Sonja Nießen. Improving statistical machine translation using morpho-syntactic information, 2002.

[12] Vincent Claveau, et al. Automatic Morphological Query Expansion Using Analogy-Based Machine Learning, 2007, ECIR.

[13] K. Holyoak, et al. The analogical mind, 1997.

[14] Philipp Koehn, et al. Improved Statistical Machine Translation Using Paraphrases, 2006, NAACL.

[15] Takaaki Tanaka, et al. Extraction of translation equivalents from non-parallel corpora, 1999, TMI.

[16] Sharon Goldwater, et al. Improving Statistical MT through Morphological Analysis, 2005, HLT.

[17] K. Holyoak, et al. The analogical mind, 1997, The American psychologist.

[18] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.

[19] Vladimir I. Levenshtein, et al. Binary codes capable of correcting deletions, insertions, and reversals, 1965.

[20] Dayne Freitag, et al. Morphology Induction from Term Clusters, 2005, CoNLL.

[21] Pascale Fung, et al. An IR Approach for Translating New Words from Nonparallel, Comparable Texts, 1998, ACL.

[22] Hermann Ney, et al. Improved Statistical Alignment Models, 2000, ACL.

[23] Philipp Koehn, et al. Empirical Methods for Compound Splitting, 2003, EACL.

[24] Hermann Ney, et al. Towards the Use of Word Stems and Suffixes for Statistical Machine Translation, 2004, LREC.

[25] Young-Suk Lee, et al. Morphological Analysis for Statistical Machine Translation, 2004, NAACL.

[26] Peter D. Turney. Similarity of Semantic Relations, 2006, CL.

[27] Hsin-Hsi Chen, et al. Proper Name Translation in Cross-Language Information Retrieval, 1998, COLING-ACL.