Adverse drug reaction detection on social media with deep linguistic features

Adverse reactions caused by drugs are one of the most important public health problems. Social media has encouraged more patients to share their drug use experiences and has become a major source for detecting adverse drug reactions (ADRs) that go unreported by professionals. Because a large number of user posts do not mention any ADR, the presence of an ADR in each post must be detected accurately before further research can be conducted. Previous feature-based methods focus on shallow linguistic features, which cannot capture deep and subtle contextual information and therefore fail to provide satisfactory accuracy. To overcome these limitations, this paper proposes a novel method that extracts deep linguistic features and combines them with shallow linguistic features for ADR detection. We first extract predicate-ADR pairs under the guidance of extended syntactic dependencies and an ADR lexicon. We then extract semantic and part-of-speech (POS) features for each pair and pool the features of the different pairs into a holistic representation of deep linguistic features. Finally, we use the deep features together with several shallow features to train the predictive models. A series of experiments is performed on data sets collected from DailyStrength and Twitter. Our approach achieves AUCs of 94.44% and 88.97% on the two data sets, respectively, outperforming other state-of-the-art methods. The results demonstrate the potential benefits of deep linguistic features for ADR detection on social media data. The method can be applied to other healthcare and text analysis tasks and can be used to support pharmacovigilance research.
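The pipeline described above (predicate-ADR pair extraction, per-pair semantic and POS features, pooling across pairs) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the dependency parse is written by hand rather than produced by a parser, the predicate is taken to be the nearest verb on the head chain (a stand-in for the "extended syntactic dependencies", which the abstract does not specify), the lexicon and the two-dimensional "semantic" vectors are toy values, and element-wise max is used as the pooling operator.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    pos: str   # coarse part-of-speech tag
    head: int  # index of the syntactic head; -1 marks the root


# Hypothetical ADR lexicon and toy 2-d "semantic" vectors (illustrative values only).
ADR_LEXICON = {"headache", "nausea"}
TOY_VECTORS = {
    "gave": [0.9, 0.1],
    "headache": [0.2, 0.7],
    "nausea": [0.1, 0.9],
}
POS_TAGS = ["VERB", "NOUN", "OTHER"]


def find_predicate(tokens, idx):
    """Walk up the head chain from tokens[idx]; return the nearest verb index or None."""
    seen = set()
    while idx != -1 and idx not in seen:
        seen.add(idx)
        if tokens[idx].pos == "VERB":
            return idx
        idx = tokens[idx].head
    return None


def extract_pairs(tokens):
    """Return (predicate_index, adr_index) for every lexicon match that has a governing verb."""
    pairs = []
    for i, tok in enumerate(tokens):
        if tok.text.lower() in ADR_LEXICON:
            pred = find_predicate(tokens, i)
            if pred is not None:
                pairs.append((pred, i))
    return pairs


def pos_one_hot(pos):
    """One-hot encode a POS tag; tags outside POS_TAGS fall back to OTHER."""
    tag = pos if pos in POS_TAGS else "OTHER"
    return [1.0 if tag == t else 0.0 for t in POS_TAGS]


def pair_features(tokens, pair):
    """Concatenate toy semantic vector and POS one-hot for the predicate, then the ADR."""
    feats = []
    for idx in pair:
        tok = tokens[idx]
        feats += TOY_VECTORS.get(tok.text.lower(), [0.0, 0.0])
        feats += pos_one_hot(tok.pos)
    return feats


def pool_pairs(tokens, pairs, dim=10):
    """Element-wise max over all pair vectors; a zero vector when the post has no pair."""
    if not pairs:
        return [0.0] * dim
    vectors = [pair_features(tokens, p) for p in pairs]
    return [max(column) for column in zip(*vectors)]


# Hand-written parse of "This drug gave me a headache and nausea".
tokens = [
    Token("This", "DET", 1),
    Token("drug", "NOUN", 2),
    Token("gave", "VERB", -1),
    Token("me", "PRON", 2),
    Token("a", "DET", 5),
    Token("headache", "NOUN", 2),
    Token("and", "CCONJ", 7),
    Token("nausea", "NOUN", 5),
]
pairs = extract_pairs(tokens)
pooled = pool_pairs(tokens, pairs)
print(pairs)
print(pooled)
```

In this sketch the pooled vector has a fixed length regardless of how many pairs a post contains, so it could be concatenated with shallow features (for example, n-gram counts) and passed to any standard classifier. A post with no lexicon match yields an all-zero deep-feature vector.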
