Building Specialized Bilingual Lexicons Using Word Sense Disambiguation

This paper presents an extension of the standard approach to bilingual lexicon extraction from comparable corpora. We study the ambiguity introduced by the seed bilingual dictionary used to translate context vectors, and augment the standard approach with a Word Sense Disambiguation process. Our aim is to identify, among the dictionary translations of each context word, those most likely to give the best representation of that word in the target language. Experiments on two specialized comparable corpora, French-English and Romanian-English, show that the proposed method consistently outperforms the standard approach.
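As a rough illustration of the idea, the sketch below projects a source-language context vector through a seed dictionary, first keeping every translation (the standard approach) and then keeping only one translation per ambiguous word. It is not the implementation evaluated in the paper: the function names, the toy dictionary, the `toy_sim` similarity table, and the choice of unambiguous context words as anchors for disambiguation are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = sqrt(sum(w * w for w in u.values()))
    nv = sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def translate_context(vector, seed_dict, similarity=None):
    """Project a source-language context vector into the target language.

    similarity=None: keep every dictionary translation (standard approach).
    similarity given: for each ambiguous word, keep only the translation
    closest to the translations of the unambiguous context words
    (an assumed, simplified disambiguation strategy).
    """
    # Translations of context words that have exactly one dictionary entry.
    anchors = [seed_dict[w][0] for w in vector if len(seed_dict.get(w, [])) == 1]
    out = Counter()
    for word, weight in vector.items():
        candidates = seed_dict.get(word, [])
        if similarity and len(candidates) > 1 and anchors:
            best = max(candidates,
                       key=lambda c: sum(similarity(c, a) for a in anchors))
            candidates = [best]
        for c in candidates:
            out[c] += weight
    return dict(out)

def rank_candidates(translated, target_vectors):
    """Rank target words by similarity of their context vector to the projection."""
    return sorted(target_vectors,
                  key=lambda t: cosine(translated, target_vectors[t]),
                  reverse=True)

# Toy data (hypothetical): French context vector and seed dictionary.
vector = {"sein": 2.0, "tumeur": 1.0}
seed_dict = {"sein": ["breast", "bosom"], "tumeur": ["tumor"]}

def toy_sim(a, b):
    # Stand-in for a real semantic similarity measure.
    return 1.0 if {a, b} == {"breast", "tumor"} else 0.0

baseline = translate_context(vector, seed_dict)
disambiguated = translate_context(vector, seed_dict, similarity=toy_sim)
```

In the baseline projection the weight of "sein" is spread over both "breast" and "bosom", whereas the disambiguated projection keeps only "breast". In practice the similarity function would come from a lexical resource or corpus statistics rather than a hand-written table.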
