Bidirectional LSTM-CRF for Adverse Drug Event Tagging in Electronic Health Records

Adverse drug event (ADE) detection is a vital step towards effective pharmacovigilance and the prevention of future incidents caused by potentially harmful ADEs. The electronic health records (EHRs) of hospital patients contain valuable information about ADEs and are therefore an important source for detecting ADE signals. We have developed a deep learning system built on a three-layer architecture: 1) a bidirectional long short-term memory (bi-LSTM) recurrent neural network (RNN) for character-level word representation, 2) a second bi-LSTM for context representation, and 3) a conditional random field (CRF) for the final output prediction, all integrated into a single deep network. In addition, we have developed customized rule-based tokenization techniques that preprocess the text to cope with the noise in EHR text. In this paper, we describe our system architecture and report its performance on the MADE1.0 NLP challenge.
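To make the role of the final CRF layer concrete, the following is a minimal sketch of Viterbi decoding over per-token tag scores. It is an illustration under stated assumptions, not the system's implementation: the tag names (O, B-ADE, I-ADE), the numeric scores, and the function name viterbi_decode are hypothetical, start and stop transitions are omitted, and in a full system the emission scores would be produced by the context bi-LSTM rather than written by hand.

```python
from typing import Dict, List, Sequence, Tuple


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: Sequence[str],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag] is the score of `tag` at token t (hypothetical values
    standing in for bi-LSTM outputs). transitions[(prev, cur)] is the score
    of moving from `prev` to `cur`; missing pairs are treated as 0.0.
    """
    if not emissions:
        return []

    # best[tag] = score of the best path that ends in `tag` at the current token
    best = {tag: emissions[0][tag] for tag in tags}
    backpointers: List[Dict[str, str]] = []

    for t in range(1, len(emissions)):
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximizes path score + transition score.
            prev = max(tags, key=lambda p: best[p] + transitions.get((p, cur), 0.0))
            new_best[cur] = (
                best[prev] + transitions.get((prev, cur), 0.0) + emissions[t][cur]
            )
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)

    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[tag])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical example: three tokens, BIO-style tags for ADE mentions.
TAGS = ["O", "B-ADE", "I-ADE"]
EMISSIONS = [
    {"O": 1.0, "B-ADE": 0.8, "I-ADE": 0.0},
    {"O": 0.2, "B-ADE": 0.1, "I-ADE": 2.0},
    {"O": 1.5, "B-ADE": 0.0, "I-ADE": 0.0},
]
# Assumed penalty: an I-ADE tag should not directly follow O.
TRANSITIONS = {("O", "I-ADE"): -10.0}

decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
print(decoded)  # ['B-ADE', 'I-ADE', 'O']
```

Picking the best tag for each token independently would give O, I-ADE, O here, which is not a valid BIO sequence. The transition penalty lets the decoder prefer B-ADE, I-ADE, O instead, which illustrates why a CRF is commonly placed on top of bi-LSTM outputs for sequence tagging.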
