Robustly Pre-Trained Neural Model for Direct Temporal Relation Extraction

Background: Identifying relationships between clinical events and temporal expressions is a key challenge in meaningfully analyzing clinical text for use in advanced AI applications. Although this problem has been studied before, state-of-the-art performance still has significant room for improvement. Methods: We studied several variants of BERT (Bidirectional Encoder Representations from Transformers), some involving clinical domain customization and others involving improved architecture and/or training strategies. We evaluated these methods on a direct temporal relations dataset, which is a semantically focused subset of the 2012 i2b2 temporal relations challenge dataset. Results: Our results show that RoBERTa, which employs better pre-training strategies, including the use of a 10x larger corpus, improves the overall F-measure by 0.0864 absolute (on a 1.00 scale), reducing the error rate by 24% relative to the previous state-of-the-art performance, which was achieved with an SVM (support vector machine) model. Conclusion: Modern contextual language modeling neural networks, pre-trained on a large corpus, achieve impressive performance even on highly nuanced clinical temporal relation tasks.
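The two figures in the Results statement are linked by simple arithmetic, which the following minimal sketch illustrates. It assumes that "error rate" means 1 - F and that the relative reduction is the absolute gain divided by the baseline error; the abstract does not spell out either definition. The function names are hypothetical, and the baseline F-measure computed here is only back-calculated from the rounded 24% figure, so it should be read as approximate and not as a reported result.

```python
# Sketch of the arithmetic relating absolute F-measure gain to relative
# error reduction. Assumption: error rate = 1 - F (not stated in the abstract).

def relative_error_reduction(f_old: float, f_new: float) -> float:
    """Fraction of the baseline error (1 - f_old) removed by the new model."""
    return (f_new - f_old) / (1.0 - f_old)


def implied_baseline(abs_gain: float, rel_reduction: float) -> float:
    """Baseline F-measure implied by an absolute gain and a relative reduction."""
    return 1.0 - abs_gain / rel_reduction


ABS_GAIN = 0.0864       # absolute F-measure gain quoted in the abstract
REL_REDUCTION = 0.24    # relative error reduction quoted in the abstract (rounded)

# Back-calculated values: approximate, inferred from the two quoted figures only.
f_old = implied_baseline(ABS_GAIN, REL_REDUCTION)
f_new = f_old + ABS_GAIN

print(f"implied baseline F ~ {f_old:.2f}, implied new F ~ {f_new:.2f}")
print(f"relative error reduction = {relative_error_reduction(f_old, f_new):.2%}")
```

Under these assumptions, the quoted gain and reduction together imply a baseline F-measure of roughly 0.64, so the same absolute gain would count as a smaller relative reduction against a weaker baseline and a larger one against a stronger baseline.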
