Clustering of Adverse Events of Post-Market Approved Drugs

Adverse side effects of a drug may vary over space and time because of differences in populations, environments, and drug quality. Discovering all side effects during the development process is impossible. Once a drug is approved, observed adverse effects are reported by doctors and patients and made available in the Adverse Event Reporting System provided by the U.S. Food and Drug Administration. Mining these records of reported adverse effects, this study proposes a spatial clustering approach to identify regions that exhibit similar adverse effects. We apply topic modeling, using Latent Dirichlet Allocation, to textual representations of the reported adverse effects. By describing each spatial region as a mixture of the resulting latent topics, we use Hierarchical Agglomerative Clustering to find clusters of regions that exhibit similar (topics of) adverse events for the same drug. We then use Moran’s I measure of spatial autocorrelation on the resulting clusters to test the hypothesis that certain (topics of) adverse effects occur only in certain spatial regions. As an example, our experimental evaluation applies the proposed framework to several blood-thinning drugs. It shows that some drugs exhibit more coherent textual topics among their reported adverse effects than others, but it finds no significant spatial autocorrelation of these topics. Our approach can be applied to other drugs or vaccines to study whether spatially localized adverse effects justify further investigation.
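
To make the final step of the framework concrete, the following is a minimal sketch of Moran's I in plain Python. It is an illustration only, not the implementation used in the study: the function name `morans_i`, the four-region toy layout, the binary adjacency weights, and the example values are all assumptions chosen for readability. Here a value per region could stand for, say, that region's share of one latent topic or its membership in one cluster.

```python
def morans_i(values, weights):
    """Global Moran's I for one value per region.

    values  -- list of numbers, one per region (hypothetical topic shares)
    weights -- square matrix (list of lists); weights[i][j] > 0 when
               regions i and j are treated as neighbors (assumed binary here)
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    total_weight = sum(sum(row) for row in weights)
    variance_term = sum(d * d for d in dev)
    if total_weight == 0 or variance_term == 0:
        raise ValueError("Moran's I is undefined without neighbors or variation")

    # Weighted sum of cross-products of deviations over all region pairs.
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / total_weight) * cross / variance_term


# Toy layout: four regions in a row, each adjacent to its direct neighbors.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [1.0, 1.0, 0.0, 0.0]    # similar values sit next to each other
alternating = [1.0, 0.0, 1.0, 0.0]  # neighbors always differ

print(round(morans_i(clustered, adjacency), 3))    # about 0.333
print(round(morans_i(alternating, adjacency), 3))  # about -1.0
```

Positive values indicate that neighboring regions tend to carry similar values, negative values indicate that neighbors tend to differ, and values near the expectation of -1/(n-1) indicate no spatial pattern. Judging significance, as the study does, additionally requires a reference distribution (for example, from random permutations of the values across regions), which this sketch omits.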
