Temporal Poisson Square Root Graphical Models

We propose temporal Poisson square root graphical models (TPSQRs), a generalization of Poisson square root graphical models (PSQRs) designed for modeling longitudinal event data. By estimating the temporal relationships between all possible pairs of event types, TPSQRs offer a holistic view of whether occurrences of any given event type excite or inhibit occurrences of any other type. A TPSQR is learned by estimating a collection of interrelated PSQRs that share the same template parameterization. These PSQRs are estimated jointly in a pseudo-likelihood fashion, with a Poisson pseudo-likelihood approximating the original, more computationally intensive pseudo-likelihood problem that arises from PSQRs. Theoretically, we show that under mild assumptions the Poisson pseudo-likelihood approximation is sparsistent for recovering the underlying PSQR. Empirically, we learn TPSQRs from Marshfield Clinic electronic health records (EHRs) containing millions of drug prescription and condition diagnosis events, with the goal of adverse drug reaction (ADR) detection. Experimental results demonstrate that the learned TPSQRs recover ADR signals from the EHR effectively and efficiently.
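
As a rough illustration of the Poisson pseudo-likelihood idea, the sketch below fits a single node-wise, l1-penalized Poisson regression of one count variable on the square roots of the others. It is not the authors' estimator: the abstract gives neither the exact PSQR parameterization nor the shared temporal template, so the log-linear link on square-root counts, the proximal-gradient solver, the function names (`soft_threshold`, `fit_node`), the penalty `lam`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the l1 norm (elementwise shrinkage toward zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def fit_node(X, s, lam=0.05, step=0.02, n_iter=3000):
    """Assumed node-wise surrogate: l1-penalized Poisson regression of
    column s on the square roots of the remaining columns, solved by
    proximal gradient descent. Returns (intercept, {column: weight})."""
    n, p = X.shape
    others = [j for j in range(p) if j != s]
    y = X[:, s].astype(float)
    Z = np.sqrt(X[:, others].astype(float))   # square-root features
    b, w = 0.0, np.zeros(len(others))
    for _ in range(n_iter):
        mu = np.exp(np.clip(b + Z @ w, -20.0, 20.0))  # Poisson mean
        resid = mu - y                # gradient of the mean negative log-likelihood
        b -= step * resid.mean()      # intercept is left unpenalized
        w = soft_threshold(w - step * (Z.T @ resid) / n, step * lam)
    return b, dict(zip(others, w))

# Hypothetical synthetic counts: event type 0 excites type 1; type 2 is independent.
rng = np.random.default_rng(0)
n = 2000
x0 = rng.poisson(2.0, size=n)
x1 = rng.poisson(np.exp(0.3 + 0.5 * np.sqrt(x0)))
x2 = rng.poisson(1.5, size=n)
X = np.column_stack([x0, x1, x2])

intercept, weights = fit_node(X, s=1)
print("intercept:", intercept)
print("weights on sqrt counts of other event types:", weights)
```

Under these assumptions, a clearly positive weight would be read as excitation, a clearly negative one as inhibition, and a weight shrunk to zero as no edge. Repeating the fit for every node and combining the estimated supports would yield a sparse graph estimate.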
