Temporal pattern discovery in longitudinal electronic patient records

Large collections of electronic patient records provide a vast but still underutilised source of information on the real-world use of medicines. They are maintained primarily for the purpose of patient administration, but contain a broad range of clinical information highly relevant for data analysis. While they are a standard resource for confirmatory epidemiological studies, their use in exploratory data analysis is still limited. In this paper, we present a framework for open-ended pattern discovery in large patient record repositories. At its core is a graphical statistical approach to summarising and visualising the temporal association between the prescription of a drug and the occurrence of a medical event. The graphical overview contrasts the observed and expected numbers of occurrences of the medical event in different time periods, both before and after the prescription of interest. To screen effectively for important temporal relationships, we introduce a new measure of temporal association, which contrasts the observed-to-expected ratio in a time period immediately after the prescription with the observed-to-expected ratio in a control period 2 years earlier. An important feature of both the observed-to-expected graph and the measure of temporal association is a statistical shrinkage towards the null hypothesis of no association, which provides protection against highlighting spurious associations. We demonstrate the usefulness of the proposed pattern discovery methodology with a set of examples from a collection of over two million patient records in the United Kingdom. The identified patterns include temporal relationships between drug prescriptions and medical events suggestive of persistent and transient risks of adverse events, possible beneficial effects of drugs, periodic co-occurrence, and systematic tendencies of patients to switch from one medication to another.
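The measure of temporal association described above can be illustrated with a minimal sketch. The abstract does not state the exact form of the shrinkage or how expected counts are derived, so the code below makes two assumptions: expected counts are supplied by the caller, and shrinkage is applied by adding a pseudo-count of 0.5 to both the observed and the expected count before taking a base-2 logarithm. The function names and the `SHRINKAGE` constant are hypothetical and chosen for illustration only.

```python
import math

# Assumed pseudo-count; the abstract does not specify the shrinkage used.
SHRINKAGE = 0.5


def shrunk_log_ratio(observed, expected, shrinkage=SHRINKAGE):
    """Return log2 of the observed-to-expected ratio, shrunk towards 0.

    Adding the same pseudo-count to numerator and denominator pulls the
    ratio towards 1 (log ratio towards 0), most strongly when counts are small.
    """
    return math.log2((observed + shrinkage) / (expected + shrinkage))


def temporal_association(obs_after, exp_after, obs_control, exp_control,
                         shrinkage=SHRINKAGE):
    """Contrast the shrunk log ratio in the period just after prescription
    with the shrunk log ratio in an earlier control period."""
    after = shrunk_log_ratio(obs_after, exp_after, shrinkage)
    control = shrunk_log_ratio(obs_control, exp_control, shrinkage)
    return after - control


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    score = temporal_association(obs_after=12, exp_after=4.0,
                                 obs_control=5, exp_control=4.5)
    print(f"temporal association score: {score:.3f}")
```

Under these assumptions, a score near zero means that the event is about as over- or under-represented after the prescription as it was in the control period, and a positive score means that it is more over-represented after the prescription. With no observations and no expectation in either period, the score is exactly zero, which is the sense in which the shrinkage favours the null hypothesis of no association.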
