Mining disproportional itemsets for characterizing groups of heart failure patients from administrative health records

Heart failure is a serious medical condition associated with decreased quality of life and an increased risk of premature death. A recent evaluation by the Swedish National Board of Health and Welfare shows that Swedish heart failure patients are often undertreated and do not receive basic medication as recommended by the national guidelines for treatment of heart failure. The objective of this paper is to use registry data to characterize groups of heart failure patients, with an emphasis on basic treatment. To this end, we explore the applicability of frequent itemset mining and disproportionality analysis for finding interesting and distinctive characterizations of a target group of patients, e.g., those who have received basic treatment, against a control group, e.g., those who have not received basic treatment. Our empirical evaluation is performed on data extracted from administrative health records from Stockholm County covering the years 2010--2016. Our findings suggest that frequency is not always the most appropriate measure of importance for frequent itemsets, whereas itemset disproportionality against a control group provides alternative rankings of the extracted itemsets, leading to some medically intuitive characterizations of the target groups.
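
The general idea of ranking frequent itemsets by disproportionality against a control group can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy patient records, the function names `itemset_counts` and `rank_by_disproportionality`, the support threshold, and the choice of a smoothed ratio of relative supports as the disproportionality score. It is not necessarily the measure or the mining algorithm used in the paper.

```python
from collections import Counter
from itertools import combinations


def itemset_counts(records, max_size=2):
    """Count how many records contain each itemset of up to max_size items."""
    counts = Counter()
    for record in records:
        items = sorted(set(record))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return counts


def rank_by_disproportionality(target, control, min_support=0.3, max_size=2):
    """Rank itemsets frequent in the target group by target/control support ratio."""
    t_counts = itemset_counts(target, max_size)
    c_counts = itemset_counts(control, max_size)
    ranked = []
    for itemset, t_count in t_counts.items():
        t_support = t_count / len(target)
        if t_support < min_support:
            continue  # keep only itemsets that are frequent in the target group
        # Add-one smoothing (an assumption) avoids division by zero for
        # itemsets that never occur in the control group.
        c_support = (c_counts[itemset] + 1) / (len(control) + 1)
        ranked.append((itemset, t_support, t_support / c_support))
    ranked.sort(key=lambda row: row[2], reverse=True)
    return ranked


# Hypothetical toy records: each patient is a set of diagnosis/medication items.
target = [
    {"ACE inhibitor", "beta blocker", "hypertension"},
    {"ACE inhibitor", "beta blocker"},
    {"ACE inhibitor", "beta blocker", "diabetes"},
    {"beta blocker", "hypertension"},
]
control = [
    {"hypertension", "diabetes"},
    {"hypertension"},
    {"beta blocker", "hypertension"},
    {"diabetes"},
]

for itemset, support, ratio in rank_by_disproportionality(target, control):
    print(itemset, round(support, 2), round(ratio, 2))
```

In this toy data the single most frequent itemset in the target group is not the one with the highest ratio against the control group, which mirrors the observation that frequency and disproportionality can produce different rankings.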
