Cognate Mapping - A Heuristic Strategy for the Semi-Supervised Acquisition of a Spanish Lexicon from a Portuguese Seed Lexicon

We address the automated acquisition of a Spanish medical subword lexicon from an existing Portuguese seed lexicon. Using two non-parallel monolingual corpora, we derive Spanish lexeme candidates from Portuguese seed lexicon entries by heuristic cognate mapping. We then validate the resulting lexical translation hypotheses by comparing fixed-window context vectors computed from the Portuguese and Spanish text corpora.
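The two steps named above, candidate generation by cognate mapping and validation by context-vector similarity, can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the rewrite rules in `RULES`, the window size, the use of cosine similarity, and the `bridge` dictionary of already known Portuguese-Spanish pairs used to make the two vector spaces comparable.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical Portuguese -> Spanish substring rewrite rules (illustrative only).
RULES = [("ção", "ción"), ("nh", "ñ"), ("lh", "j"), ("dade", "dad")]


def cognate_candidates(pt_word, es_vocab, rules=RULES):
    """Apply every subset of the applicable rules to pt_word and keep
    only the resulting strings that are attested in the Spanish vocabulary."""
    applicable = [rule for rule in rules if rule[0] in pt_word]
    candidates = set()
    for k in range(len(applicable) + 1):
        for subset in combinations(applicable, k):
            form = pt_word
            for src, tgt in subset:
                form = form.replace(src, tgt)
            if form in es_vocab:
                candidates.add(form)
    return candidates


def context_vector(target, tokens, window=2):
    """Count the tokens found within +/- window positions of each
    occurrence of target in the token list."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo = max(0, i - window)
            counts.update(tokens[lo:i] + tokens[i + 1:i + 1 + window])
    return counts


def translate_vector(vec, bridge):
    """Project a Portuguese context vector onto Spanish dimensions using
    a dictionary of known pairs (an assumption of this sketch)."""
    out = Counter()
    for word, count in vec.items():
        if word in bridge:
            out[bridge[word]] += count
    return out


def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Toy, non-parallel "corpora" (invented sentences, not real data).
    pt_tokens = "a inflamação do fígado causa dor".split()
    es_tokens = "la inflamación del hígado causa dolor".split()
    bridge = {"a": "la", "do": "del", "fígado": "hígado", "causa": "causa"}

    for cand in cognate_candidates("inflamação", set(es_tokens)):
        pt_vec = translate_vector(context_vector("inflamação", pt_tokens), bridge)
        es_vec = context_vector(cand, es_tokens)
        print(cand, cosine(pt_vec, es_vec))
```

In this sketch a candidate is accepted only if it is attested in the Spanish corpus vocabulary, and the similarity score then serves as a second filter. A real system would need a threshold on that score and a far larger rule set, neither of which is specified here.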
