Defining Medical Words: Transposing Morphosemantic Analysis from French to English

Medical language, like many technical languages, is rich in morphologically complex words, many of which take their roots in Greek and Latin, in which case they are called neoclassical compounds. Morphosemantic analysis can help generate definitions of such words. This paper reports work on adapting a morphosemantic analyzer designed for French (DériF) to the analysis of English medical neoclassical compounds. It presents the principles of this transposition and its current performance. The analyzer was tested on a set of 1,299 compounds extracted from the WHO-ART terminology. Of these, 859 could be decomposed and defined, 675 of them successfully. An advantage of this process is that complex linguistic analyses designed for French could be successfully transferred to the analysis of English medical neoclassical compounds. Moreover, the resulting system can produce more complete analyses of English medical compounds than existing systems, including a hierarchical decomposition and a semantic gloss of each word.
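To make the idea of decomposing a neoclassical compound and deriving a gloss more concrete, here is a minimal sketch in Python. It is not DériF and does not reproduce its rules, lexicon, or output format: the tiny tables `COMBINING_FORMS` and `SUFFIXES`, the functions `decompose` and `gloss`, and the gloss wording are all assumptions made for illustration only.

```python
# Toy lexicon of combining forms and suffixes (illustrative assumptions,
# not the resources used by DériF).
COMBINING_FORMS = {
    "gastr": "stomach",
    "enter": "intestine",
    "hepat": "liver",
    "neur": "nerve",
    "arthr": "joint",
    "my": "muscle",
}
SUFFIXES = {
    "itis": "inflammation of",
    "algia": "pain in",
    "pathy": "disease of",
    "megaly": "enlargement of",
}


def _segment(stem):
    """Split a stem into known combining forms, skipping a linking 'o'.

    Returns a list of forms, or None if the stem cannot be fully covered.
    """
    if stem == "":
        return []
    for form in sorted(COMBINING_FORMS, key=len, reverse=True):
        if not stem.startswith(form):
            continue
        rest = stem[len(form):]
        # Try with a linking vowel first, then without one.
        for link in ("o", ""):
            if rest.startswith(link):
                tail = _segment(rest[len(link):])
                if tail is not None:
                    return [form] + tail
    return None


def decompose(word):
    """Return (combining_forms, suffix), or None if no analysis is found."""
    word = word.lower()
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix):
            parts = _segment(word[: -len(suffix)])
            if parts:  # require at least one combining form
                return parts, suffix
    return None


def gloss(word):
    """Build a rough English gloss from the decomposition, or None."""
    analysis = decompose(word)
    if analysis is None:
        return None
    parts, suffix = analysis
    meanings = [COMBINING_FORMS[p] for p in parts]
    return SUFFIXES[suffix] + " " + " and ".join(meanings)


if __name__ == "__main__":
    for w in ("gastroenteritis", "hepatomegaly", "myalgia", "headache"):
        print(w, "->", decompose(w), "->", gloss(w))
```

Words that the toy lexicon cannot cover, such as "headache", simply return `None`, which loosely mirrors the distinction in the evaluation between compounds that could be decomposed and those that could not. A real analyzer would also need to handle allomorphy, ambiguous segmentations, and the hierarchical structure of the compound, none of which this sketch attempts.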
