A Machine Translation Approach for Medical Terms

We describe the task of translating clinical term descriptions from Spanish to Brazilian Portuguese. We built a statistical machine translation (SMT) system using in-domain parallel corpora and available machine learning tools, and compared its performance with general-purpose machine translation systems available online. We validated the output of the different systems in several ways: against reference domain terminology, and by checking whether the translated descriptions occur in a corpus of medical scientific literature and in domain-specific web pages. We also used two sets of 1000 description terms that were revised and checked by a Portuguese speaker. In these preliminary evaluations, the SMT system we built performed very well.
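The occurrence-based validation can be illustrated with a minimal sketch. It is not the system described here: the function names, the normalization step, and the sample terms are assumptions made for illustration, and plain substring matching stands in for whatever matching was actually used.

```python
from typing import Dict, Iterable, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so lookups ignore trivial differences."""
    return " ".join(text.lower().split())


def validate_translations(
    candidates: Iterable[str],
    reference_terms: Iterable[str],
    corpus_sentences: Iterable[str],
) -> Dict[str, dict]:
    """Check each candidate translation against two hypothetical resources.

    - in_reference: the candidate matches an entry of the reference terminology
    - corpus_hits: number of corpus sentences that contain the candidate
    """
    reference = {normalize(term) for term in reference_terms}
    corpus: List[str] = [normalize(sentence) for sentence in corpus_sentences]

    results = {}
    for candidate in candidates:
        norm = normalize(candidate)
        results[candidate] = {
            "in_reference": norm in reference,
            "corpus_hits": sum(norm in sentence for sentence in corpus),
        }
    return results


# Illustrative data only; not taken from the evaluation sets.
candidates = ["Infarto do miocárdio", "dor torácica aguda"]
reference_terms = ["infarto do miocárdio"]
corpus_sentences = [
    "Paciente com infarto do miocárdio prévio.",
    "Relato de dor torácica aguda e dispneia.",
    "Dor torácica aguda foi a queixa principal.",
]

results = validate_translations(candidates, reference_terms, corpus_sentences)
for term, checks in results.items():
    print(term, checks)
```

A candidate that is missing from the reference terminology but occurs often in the corpus may still be a valid translation, which is one reason to combine both signals with manual review.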
