A Textual Representation Scheme for Identifying Clinical Relationships in Patient Records

Identifying relationships between clinical concepts in patient records is a preliminary step for many important applications in medical informatics, ranging from quality of care to hypothesis generation. We describe an approach that facilitates the automatic recognition of relationships defined between two different concepts in text. Unlike the traditional bag-of-words representation, our scheme represents a relationship with five distinct context-blocks based on the position of the concepts in the text. We applied the scheme to eight different relationships between medical problems, treatments and tests, on a set of 349 patient records from the 4th i2b2 challenge. The context-block representation substantially outperformed the bag-of-words model (F-measure = 0.775 versus 0.402). The advantage of the scheme lies in its handling of word-position information, which may be critical in identifying certain relationships.
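The contrast between the two representations can be illustrated with a minimal sketch. The abstract does not define the five context-blocks, so the split used here (text before the first concept, the first concept, text between the concepts, the second concept, text after the second concept) is an assumption, as are the function names, the `block=token` feature format, and the example sentence.

```python
from collections import Counter

# Assumed block layout; the abstract only says there are five position-based blocks.
BLOCK_NAMES = ("before", "concept1", "between", "concept2", "after")

def context_blocks(tokens, span1, span2):
    """Split tokens into five blocks around two concept spans.

    Each span is (start, end) in token indices, end exclusive.
    """
    (s1, e1), (s2, e2) = sorted([span1, span2])
    if e1 > s2:
        raise ValueError("concept spans must not overlap")
    return {
        "before": tokens[:s1],
        "concept1": tokens[s1:e1],
        "between": tokens[e1:s2],
        "concept2": tokens[s2:e2],
        "after": tokens[e2:],
    }

def block_features(blocks):
    """Count tokens per block, so the same word in different blocks stays distinct."""
    return Counter(
        f"{name}={tok.lower()}" for name in BLOCK_NAMES for tok in blocks[name]
    )

def bag_of_words(tokens):
    """Baseline: token counts with all position information discarded."""
    return Counter(tok.lower() for tok in tokens)

# Hypothetical sentence with a treatment ("aspirin") and a problem ("chest pain").
tokens = "The patient was given aspirin for his chest pain yesterday".split()
blocks = context_blocks(tokens, (4, 5), (7, 9))
features = block_features(blocks)
```

In this sketch the word "for" becomes the feature `between=for`, whereas the bag-of-words baseline records only that "for" occurs somewhere in the sentence. Either feature set could then be passed to a standard classifier; the choice of classifier is not specified in the abstract.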
