Learning and inspecting classification rules from longitudinal epidemiological data to identify predictive features on hepatic steatosis

Abstract: Personalized medicine requires the analysis of epidemiological data to identify subgroups that share risk factors and exhibit distinct outcome risks. We investigate the potential of data mining methods for analyzing subgroups of cohort participants with respect to hepatic steatosis. We propose a workflow for data preparation and mining on epidemiological data, and we present InteractiveRuleMiner, an interactive tool for inspecting the rules in each subpopulation, including functionalities for juxtaposing labeled and unlabeled individuals. We report our insights on specific subpopulations that were discovered in a data-driven rather than hypothesis-driven way.
