A Deep Learning Based Named Entity Recognition Approach for Adverse Drug Events Identification and Extraction in Health Social Media

Drug safety surveillance plays a significant role in supporting medication decision-making by both healthcare providers and patients. Extracting adverse drug events (ADEs) from social media is a promising direction for addressing this challenging task. Prior studies typically perform lexicon-based extraction using existing dictionaries or medical lexicons. While those approaches can capture ADEs and identify risky drugs from patient social media postings, they often fail to detect ADEs whose descriptive words do not appear in medical lexicons and dictionaries. In addition, their performance degrades when ADE-related social media content is expressed ambiguously. In this research, we propose a framework that uses advanced natural language processing and deep learning for high-performance ADE extraction. The framework consists of two components: word embeddings trained on a large medical-domain corpus to capture precise semantic and syntactic word relationships, and a deep-learning-based named entity recognition method for identifying and predicting drug and ADE entities. Experimental results show that our framework significantly outperforms existing models when extracting ADEs from social media across different test beds.
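To make the entity identification step concrete, the following minimal sketch shows how per-token labels from a named entity recognition model could be turned into drug and ADE mentions. It is an illustration only, not the method described above: the BIO tagging scheme, the label names `DRUG` and `ADE`, the function name `decode_bio`, and the example post are all assumptions, and the tags are written by hand rather than predicted by a trained model.

```python
from typing import List, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_type, text) spans from BIO tags (assumed scheme)."""
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Emit the span being built, if there is one.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_type = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not match the open entity,
            # closes the current span and is treated as outside.
            flush()
            current_type = None
            current_tokens = []
    flush()
    return spans


# Hypothetical post and hand-written tags, for illustration only.
tokens = ["Metformin", "gave", "me", "a", "really", "upset", "stomach"]
tags = ["B-DRUG", "O", "O", "O", "O", "B-ADE", "I-ADE"]
mentions = decode_bio(tokens, tags)
print(mentions)
```

Because the model labels tokens rather than looking phrases up, a colloquial description such as "upset stomach" can be extracted even when that exact phrase is absent from a medical lexicon, which is the limitation of lexicon-based extraction noted above.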
