Temporal Expression Classification and Normalization From Chinese Narrative Clinical Texts: Pattern Learning Approach

Background: Temporal information appears frequently in narrative clinical text, in descriptions of disease progress, prescriptions, medications, surgical progress, and discharge summaries. Accurate extraction and normalization of temporal expressions can improve the analysis and understanding of narrative clinical texts and thereby support clinical research and practice.

Objective: The goal of the study was to propose a novel approach for extracting and normalizing temporal expressions from Chinese narrative clinical text.

Methods: TNorm, a rule-based and pattern learning-based approach, was developed for automatic temporal expression extraction and normalization from unstructured Chinese clinical text. TNorm consists of three stages: extraction, classification, and normalization. First, it applies a set of heuristic rules and automatically generated patterns to identify and extract temporal expressions from clinical texts. Then, it collects features of the extracted temporal expressions and uses machine learning algorithms to predict and classify their temporal types. Finally, it combines these features with the rule-based and pattern learning-based approach to normalize the extracted temporal expressions.

Results: The evaluation dataset is a set of Chinese narrative clinical texts containing 1459 discharge summaries from a domestic Grade A Class 3 hospital. With temporal expression extraction and temporal type prediction combined, TNorm achieves a precision of 0.8491, a recall of 0.8328, and an F1 score of 0.8409 in temporal expression normalization.

Conclusions: This study presents an automatic approach, TNorm, that extracts and normalizes temporal expressions from Chinese narrative clinical texts. TNorm was evaluated on discharge summary data, and the results demonstrate its effectiveness in temporal expression normalization.
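The three stages named in the abstract (extraction, classification, normalization) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not TNorm's actual implementation: the two regular expressions, the type labels `DATE` and `RELATIVE`, the 30-day approximation of a month, and the function names are all hypothetical. In particular, the classification stage here is a simple rule stand-in, whereas the paper uses machine learning algorithms for type prediction. The `f1_score` helper only checks that the reported F1 score is the harmonic mean of the reported precision and recall.

```python
import re
from datetime import date, timedelta

# Hypothetical patterns for illustration only; not TNorm's learned patterns.
ABSOLUTE = re.compile(r"(\d{4})年(\d{1,2})月(\d{1,2})日")  # e.g. 2019年3月5日
RELATIVE = re.compile(r"(\d+)(天|日|周|月)前")              # e.g. 3天前 (3 days ago)

# Assumed unit lengths in days; a month is approximated as 30 days.
UNIT_DAYS = {"天": 1, "日": 1, "周": 7, "月": 30}


def extract(text):
    """Stage 1: return every pattern match, ordered by position in the text."""
    hits = []
    for pattern in (ABSOLUTE, RELATIVE):
        hits.extend((m.start(), m.group(0)) for m in pattern.finditer(text))
    return [expr for _, expr in sorted(hits)]


def classify(expr):
    """Stage 2: assign a coarse temporal type (rule stand-in for a learned classifier)."""
    if ABSOLUTE.fullmatch(expr):
        return "DATE"
    if RELATIVE.fullmatch(expr):
        return "RELATIVE"
    return "UNKNOWN"


def normalize(expr, anchor):
    """Stage 3: map an expression to an ISO 8601 date, resolving relative ones against anchor."""
    kind = classify(expr)
    if kind == "DATE":
        year, month, day = map(int, ABSOLUTE.fullmatch(expr).groups())
        return date(year, month, day).isoformat()
    if kind == "RELATIVE":
        amount, unit = RELATIVE.fullmatch(expr).groups()
        return (anchor - timedelta(days=int(amount) * UNIT_DAYS[unit])).isoformat()
    return None


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # "The patient was admitted on 2019-03-05; fever began 3 days earlier."
    text = "患者于2019年3月5日入院，3天前出现发热。"
    admission = date(2019, 3, 5)  # assumed anchor date for relative expressions
    for expr in extract(text):
        print(expr, classify(expr), normalize(expr, admission))
    print(round(f1_score(0.8491, 0.8328), 4))
```

Resolving relative expressions against an anchor date (here, the admission date) is one common design choice; the abstract does not state which reference time TNorm uses.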
