A Lightweight Approach for Extracting Disease-Symptom Relation with MetaMap toward Automated Generation of Disease Knowledge Base

Diagnostic decision support systems require a disease knowledge base, and building it may account for a dominant portion of the total development cost of such systems. Accordingly, as a step toward automated generation of a disease knowledge base, we conducted a preliminary study on efficient extraction of symptomatic expressions using MetaMap, a tool that assigns UMLS (Unified Medical Language System) semantic tags to phrases in a given medical text. We first used several tags in the MetaMap output that relate to symptoms and findings to extract symptomatic terms. This straightforward approach yielded a recall of 82% and a precision of 64%. We then applied a heuristic that exploits certain patterns of tag sequences that frequently appear in typical symptomatic expressions. This simple approach achieved a 7% gain in recall without sacrificing precision. Although the extracted information still requires manual inspection, the study suggests that this simple approach can extract symptomatic expressions at very low cost. We also performed a failure analysis of the output to guide further performance improvements.
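The first step described above, filtering phrases by symptom-related semantic tags and scoring the result, can be illustrated with a minimal Python sketch. It is not the study's implementation: the tag set (`sosy` for Sign or Symptom, `fndg` for Finding) is an assumption, since the abstract does not list the tags used, and the input is a hand-written stand-in for already-parsed MetaMap output rather than real tool output. The function names are hypothetical.

```python
from typing import Iterable, Sequence

# Assumed tag set: UMLS semantic type abbreviations for "Sign or Symptom"
# and "Finding". The tags actually used in the study are not given here.
SYMPTOM_TAGS = frozenset({"sosy", "fndg"})


def extract_symptom_terms(
    tagged_phrases: Iterable[tuple[str, Sequence[str]]],
    tags: frozenset = SYMPTOM_TAGS,
) -> list[str]:
    """Keep phrases that carry at least one symptom-related semantic tag."""
    return [phrase for phrase, sem_types in tagged_phrases if tags & set(sem_types)]


def precision_recall(extracted: Iterable[str], gold: Iterable[str]) -> tuple[float, float]:
    """Compute precision and recall of extracted terms against a gold list."""
    extracted_set, gold_set = set(extracted), set(gold)
    true_positives = len(extracted_set & gold_set)
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical (phrase, semantic types) pairs standing in for parsed MetaMap output.
tagged = [
    ("fever", ["sosy"]),
    ("pneumonia", ["dsyn"]),       # Disease or Syndrome: filtered out
    ("chest pain", ["sosy"]),
    ("family history", ["fndg"]),  # tagged as a finding, but not a symptom
    ("cough", []),                 # a symptom left untagged: a miss
]
gold = ["fever", "chest pain", "cough"]  # hypothetical manual annotation

extracted = extract_symptom_terms(tagged)
precision, recall = precision_recall(extracted, gold)
print(extracted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy data, "family history" is a false positive and "cough" is a false negative, which mirrors the two kinds of error that manual inspection and failure analysis would need to address.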
