Information Extraction from Medical Social Media

Extracting information from unstructured text is important because automatic processing and analysis of text require structured information. Algorithms and tools already exist for mapping clinical and biomedical documents to concepts from medical terminologies and ontologies. Applied to a document, they return extracted concepts that represent its content. It is unclear, however, whether these tools are applicable to medical social media. As we have seen in the previous sections, the language of medical social media texts differs from that of clinical documents. In this chapter, we assess the extraction quality of such tools through a qualitative study: we compare the mapping quality of two mapping or named entity recognition tools, originally designed for processing clinical texts, when they are applied to medical social media text.
