An overview of MetaMap: historical perspective and recent advances

MetaMap is a widely available program providing access to the concepts in the Unified Medical Language System (UMLS) Metathesaurus from biomedical text. This study reports on MetaMap's evolution over more than a decade, concentrating on those features arising out of the research needs of the biomedical informatics community both within and outside of the National Library of Medicine. Such features include the detection of author-defined acronyms/abbreviations, the ability to browse the Metathesaurus for concepts even tenuously related to input text, the detection of negation in situations in which the polarity of predications is important, word sense disambiguation (WSD), and various technical and algorithmic features. Near-term plans for MetaMap development include the incorporation of chemical name recognition and enhanced WSD.
