Development of an automated assessment tool for MedWatch reports in the FDA adverse event reporting system

Objective: The US Food and Drug Administration (FDA) receives more than one million adverse event reports associated with medication use every year, so a system is needed to help FDA safety evaluators identify the reports most likely to demonstrate a causal relationship to the suspect medications. We combined text mining with machine learning to construct and evaluate such a system for identifying medication-related adverse event reports.

Methods: FDA safety evaluators assessed 326 reports for medication-related causality. We engineered features from these reports and constructed random forest, L1-regularized logistic regression, and support vector machine models. We evaluated model accuracy and further assessed utility by generating report rankings that represented a prioritized report review process.

Results: Our random forest model showed the best performance in report ranking and accuracy, with an area under the receiver operating characteristic curve of 0.66. The generated report ordering assigns a higher rank to reports with a higher probability of medication-related causality and is significantly correlated with a perfect report ordering, with a Kendall's tau of 0.24 (P = .002).

Conclusion: Our models produced prioritized report orderings that enable FDA safety evaluators to focus on the reports more likely to contain valuable medication-related adverse event information. Applying our models to all FDA adverse event reports has the potential to streamline the manual review process and greatly reduce reviewer workload.
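The ranking evaluation described in the abstract can be illustrated with a minimal sketch. It assumes each report has a binary evaluator label (1 = medication-related) and a model-predicted probability; the data, the function names, and the choice of the tie-corrected tau-b variant are assumptions for illustration, not the study's implementation.

```python
from itertools import combinations
from math import sqrt


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kendall_tau_b(x, y):
    """Kendall's tau-b, which corrects for ties (binary labels tie heavily)."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from every count
        elif dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + ties_x) * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else 0.0


# Hypothetical data: evaluator labels and predicted probabilities for six reports.
labels = [0, 1, 0, 1, 1, 0]
scores = [0.8, 0.9, 0.2, 0.7, 0.6, 0.4]

# Prioritized review order: report indices sorted by descending probability.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

auc = roc_auc(labels, scores)
tau = kendall_tau_b(scores, labels)
print("review order:", ranking)
print("AUC:", round(auc, 3), "Kendall tau-b:", round(tau, 3))
```

A perfect ordering would place every medication-related report ahead of every unrelated one, so tau here measures how closely the score-based ordering agrees with the evaluator labels.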
