Defining and relating biomedical terms: Towards a cross-language morphosemantics-based system

This paper addresses the issue of how semantic information, i.e. both a definition and a set of semantic relations, can be automatically assigned to compound terms. This is particularly crucial when building multilingual databases and when developing cross-language information retrieval systems. The paper shows how morphosemantics can contribute to the construction of multilingual lexical networks in biomedical corpora. It presents a system capable of labelling terms with morphologically related words, i.e. providing them with a definition, and of grouping them according to synonymy, hyponymy and proximity relations. The approach requires the interaction of three techniques: (1) a language-specific morphosemantic parser, (2) a multilingual table defining basic relations between word roots and (3) a set of language-independent rules to draw up the list of related terms. This approach has been fully implemented for French, on a biomedical lexicon of about 29,000 terms, resulting in more than 3,000 lexical families. A validation of the results against a file manually annotated by domain experts is presented, followed by a discussion of our method.
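As a rough illustration of how the three techniques could interact, the following minimal Python sketch parses a term into a root and a head, looks the root up in a small root table, and applies a few relation rules. Everything in it is an assumption made for illustration: the names `ROOT_CONCEPT`, `BROADER`, `HEADS`, `parse`, `define` and `relate`, the toy (unaccented) French entries, and the rules themselves are hypothetical and are not the parser, the table or the rules of the system described here.

```python
from typing import NamedTuple, Optional


class Analysis(NamedTuple):
    root: str  # combining form found at the start of the term
    head: str  # final element carrying the head meaning


# (2) Toy root table (hypothetical): root -> language-neutral concept.
ROOT_CONCEPT = {
    "gastr": "STOMACH",
    "stomac": "STOMACH",
    "hepat": "LIVER",
    "viscer": "VISCUS",
}
# Hypothetical hierarchy between root concepts: concept -> broader concept.
BROADER = {"STOMACH": "VISCUS", "LIVER": "VISCUS"}
# Hypothetical glosses used to build definitions.
CONCEPT_GLOSS = {"STOMACH": "stomach", "LIVER": "liver", "VISCUS": "viscus"}
HEADS = {"algie": "pain", "ite": "inflammation"}


def parse(term: str) -> Optional[Analysis]:
    """(1) Toy language-specific parser: split a term into root + head."""
    # Try longer heads first so the longest matching ending wins.
    for head in sorted(HEADS, key=len, reverse=True):
        if term.endswith(head):
            root = term[: -len(head)]
            if root in ROOT_CONCEPT:
                return Analysis(root, head)
    return None


def define(term: str) -> Optional[str]:
    """Build a gloss such as 'pain of the stomach' from the analysis."""
    analysis = parse(term)
    if analysis is None:
        return None
    concept = ROOT_CONCEPT[analysis.root]
    return f"{HEADS[analysis.head]} of the {CONCEPT_GLOSS[concept]}"


def relate(a: str, b: str) -> Optional[str]:
    """(3) Toy language-independent rules relating term a to term b."""
    pa, pb = parse(a), parse(b)
    if pa is None or pb is None:
        return None
    ca, cb = ROOT_CONCEPT[pa.root], ROOT_CONCEPT[pb.root]
    if pa.head == pb.head:
        if ca == cb:
            return "synonymy"   # same head, roots denote the same concept
        if BROADER.get(ca) == cb:
            return "hyponymy"   # a's root concept is narrower than b's
        return None
    # Different heads but the same root concept: loosely related terms.
    return "proximity" if ca == cb else None


if __name__ == "__main__":
    print(define("gastralgie"))
    for other in ("stomacalgie", "visceralgie", "gastrite", "hepatite"):
        print("gastralgie", other, relate("gastralgie", other))
```

In this sketch only `parse` depends on the language; the table and the rules work on language-neutral concepts, which is the property that would let terms from different languages fall into the same lexical family.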
