Personalized Predictive Modeling and Risk Factor Identification using Patient Similarity

Personalized predictive models are customized for an individual patient and trained using information from similar patients. Compared to global models trained on all patients, they have the potential to produce more accurate risk scores and to capture more relevant risk factors for individual patients. This paper presents an approach for building personalized predictive models and generating personalized risk factor profiles. A locally supervised metric learning (LSML) similarity measure is trained for diabetes onset and used to find clinically similar patients. Personalized risk profiles are created by analyzing the parameters of the trained personalized logistic regression models. The approach is evaluated on a data set of 15,000 patients derived from electronic health records. The predictive results show that the personalized models can outperform the global model. Cluster analysis of the risk profiles shows groups of patients with similar risk factors, differences in the top risk factors across groups of patients, and differences between the individual and global risk factors.
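The pipeline described in the abstract (find similar patients, fit a logistic regression on that cohort, read a risk profile from its coefficients) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the cohort size `k`, the optimizer settings, and the optional projection matrix `W` are all assumptions made for illustration. In particular, `W` merely stands in for a learned metric such as LSML, which is not implemented here; with `W=None` the similarity falls back to plain Euclidean distance.

```python
import numpy as np


def nearest_patients(X, query, k, W=None):
    """Return indices of the k rows of X closest to the query patient.

    W is a hypothetical projection matrix standing in for a learned
    similarity metric; with W=None this is plain Euclidean distance.
    """
    diff = X - query
    if W is not None:
        diff = diff @ W
    dist = np.sqrt((diff ** 2).sum(axis=1))
    return np.argsort(dist)[:k]


def fit_logistic(X, y, lr=0.1, n_iter=500, l2=1e-2):
    """Fit L2-regularized logistic regression by batch gradient descent.

    Returns a weight vector whose first entry is the intercept.
    """
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        grad = Xb.T @ (p - y) / len(y)
        grad[1:] += l2 * w[1:]  # do not regularize the intercept
        w -= lr * grad
    return w


def personalized_model(X, y, query, k=50, W=None):
    """Train a model on the query patient's k most similar patients.

    Returns the predicted risk for the query, the fitted weights, and
    feature indices ranked by absolute coefficient (a simple stand-in
    for a personalized risk factor profile).
    """
    idx = nearest_patients(X, query, k, W)
    w = fit_logistic(X[idx], y[idx])
    risk = 1.0 / (1.0 + np.exp(-(w[0] + query @ w[1:])))
    profile = np.argsort(-np.abs(w[1:]))
    return risk, w, profile
```

Calling `personalized_model(X, y, X[i])` for each patient `i` yields one coefficient vector per patient; stacking those vectors gives a matrix that could then be clustered, in the spirit of the cluster analysis the abstract mentions.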
