Semantic relations for problem-oriented medical records

OBJECTIVE: We describe semantic relation (SR) classification on medical discharge summaries. We focus on relations targeted to the creation of problem-oriented records. Thus, we define relations that involve the medical problems of patients.

METHODS AND MATERIALS: We represent patients' medical problems with their diseases and symptoms. We study the relations of patients' problems with each other and with concepts that are identified as tests and treatments. We present an SR classifier that studies a corpus of patient records one sentence at a time. For all pairs of concepts that appear in a sentence, this SR classifier determines the relations between them. In doing so, the SR classifier takes advantage of surface, lexical, and syntactic features and uses these features as input to a support vector machine. We apply our SR classifier to two sets of medical discharge summaries, one obtained from the Beth Israel-Deaconess Medical Center (BIDMC), Boston, MA and the other from Partners Healthcare, Boston, MA.

RESULTS: On the BIDMC corpus, our SR classifier achieves micro-averaged F-measures that range from 74% to 95% on the various relation types. On the Partners corpus, the micro-averaged F-measures on the various relation types range from 68% to 91%. Our experiments show that lexical features (in particular, tokens that occur between candidate concepts, which we refer to as inter-concept tokens) are very informative for relation classification in medical discharge summaries. Using only the inter-concept tokens in the corpus, our SR classifier can recognize 84% of the relations in the BIDMC corpus and 72% of the relations in the Partners corpus.

CONCLUSION: These results are promising for semantic indexing of medical records. They imply that we can take advantage of lexical patterns in discharge summaries for relation classification at a sentence level.
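The sentence-level pairing and the inter-concept token features described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Concept` type, the function names, the feature-name strings, and the example sentence are all hypothetical, and the sketch assumes that concepts are already annotated as non-overlapping token spans labeled "problem", "test", or "treatment".

```python
from itertools import combinations
from typing import Dict, Iterator, List, NamedTuple, Tuple


class Concept(NamedTuple):
    """Hypothetical annotated concept: a token span plus a semantic type."""
    start: int  # index of the first token of the concept
    end: int    # index one past the last token of the concept
    kind: str   # "problem", "test", or "treatment"


def candidate_pairs(concepts: List[Concept]) -> Iterator[Tuple[Concept, Concept]]:
    """Yield every pair of concepts in a sentence that involves a problem.

    Pairs are returned in left-to-right order of appearance.
    """
    ordered = sorted(concepts, key=lambda c: c.start)
    for left, right in combinations(ordered, 2):
        if "problem" in (left.kind, right.kind):
            yield left, right


def inter_concept_tokens(tokens: List[str], left: Concept, right: Concept) -> List[str]:
    """Return the lowercased tokens that occur between the two concepts."""
    return [t.lower() for t in tokens[left.end:right.start]]


def pair_features(tokens: List[str], left: Concept, right: Concept) -> Dict[str, int]:
    """Build a sparse binary feature map for one candidate pair.

    Only inter-concept tokens and the pair's semantic types are used here;
    the surface and syntactic features mentioned in the text are omitted.
    """
    feats = {f"between={t}": 1 for t in inter_concept_tokens(tokens, left, right)}
    feats[f"types={left.kind}-{right.kind}"] = 1
    return feats


# Invented example sentence with two annotated concepts.
tokens = "The patient was given lasix for his edema .".split()
concepts = [Concept(7, 8, "problem"), Concept(4, 5, "treatment")]

pairs = list(candidate_pairs(concepts))
features = [pair_features(tokens, left, right) for left, right in pairs]
```

Each feature map would then be turned into a vector and passed, together with a relation label, to a support vector machine; which SVM library and kernel to use is left open here.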
