Research Paper: "Understanding" Medical School Curriculum Content Using KnowledgeMap

OBJECTIVE To describe the development and evaluation of computational tools to identify concepts within medical curricular documents, using information derived from the National Library of Medicine's Unified Medical Language System (UMLS). The long-term goal of the KnowledgeMap (KM) project is to provide faculty and students with an improved ability to develop, review, and integrate components of the medical school curriculum.

DESIGN The KM concept identifier uses lexical resources partially derived from the UMLS (SPECIALIST lexicon and Metathesaurus), heuristic language processing techniques, and an empirical scoring algorithm. KM differentiates among potentially matching Metathesaurus concepts within a source document. The authors manually identified important "gold standard" biomedical concepts within selected medical school full-content lecture documents and used these documents to compare KM concept recognition with that of a known state-of-the-art "standard", the National Library of Medicine's MetaMap program.

MEASUREMENTS The number of "gold standard" concepts in each lecture document identified by either KM or MetaMap, and the cause of each failure or relative success in a random subset of documents.

RESULTS For 4,281 "gold standard" concepts, MetaMap matched 78% and KM 82%. Precision for "gold standard" concepts was 85% for MetaMap and 89% for KM. The heuristics of KM accurately matched acronyms, concepts underspecified in the document, and ambiguous matches. The most frequent cause of matching failures was absence of target concepts from the UMLS Metathesaurus.

CONCLUSION The prototypic KM system provided an encouraging rate of concept extraction for representative medical curricular texts. Future versions of KM should be evaluated for their ability to allow administrators, lecturers, and students to navigate through the medical curriculum to locate redundancies, find interrelated information, and identify omissions. In addition, the ability of KM to meet specific, personal information needs should be assessed.
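The match rate and precision figures reported above can be illustrated with a minimal sketch. It assumes the conventional definitions (match rate as correct matches divided by the number of gold-standard concepts, precision as correct matches divided by the number of concepts the tool proposed); the abstract does not spell out the exact formulas used in the study. The function name, phrases, and concept identifiers below are hypothetical placeholders, not data from the paper or real UMLS identifiers.

```python
def match_rate_and_precision(gold, proposed):
    """Score one tool's output against a manually built gold standard.

    gold:     dict mapping each gold-standard phrase to its correct concept id
    proposed: dict mapping a phrase to the single concept id the tool chose
              (phrases the tool could not match are simply absent)
    Both structures are illustrative assumptions, not the study's format.
    """
    # A proposal counts as correct only if it names the gold concept for that phrase.
    correct = sum(
        1 for phrase, concept in proposed.items() if gold.get(phrase) == concept
    )
    match_rate = correct / len(gold) if gold else 0.0
    precision = correct / len(proposed) if proposed else 0.0
    return match_rate, precision


# Toy example with made-up identifiers.
gold = {
    "myocardial infarction": "ID1",
    "MI": "ID1",
    "aspirin": "ID2",
    "cold": "ID3",
}
proposed = {
    "myocardial infarction": "ID1",  # correct
    "MI": "ID9",                     # wrong sense chosen for the acronym
    "aspirin": "ID2",                # correct
    # "cold" was not matched at all
}

match_rate, precision = match_rate_and_precision(gold, proposed)
print(f"match rate = {match_rate:.2f}, precision = {precision:.2f}")
```

In this toy case an unmatched phrase lowers only the match rate, while a wrong match lowers both figures, which is one reason a tool's precision can be higher than its match rate.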
