DensityTransfer: A Data Driven Approach for Imputing Electronic Health Records

Patient Electronic Health Records (EHRs) are systematic collections of electronic patient health information, including demographics, diagnoses, medications, procedures, and lab tests. With the rapid development of hardware and storage technologies, more and more EHRs are becoming available, and they now serve as the basis for many medical informatics applications, such as predictive modeling, patient risk stratification, and care pathway analysis. One major challenge of working with EHRs is sparsity: a patient's EHR is recorded only when the patient visits a clinical facility, and patients typically do not visit such facilities frequently unless they are severely ill and need intensive monitoring. In this paper, we propose Density Transfer, a data-driven approach for imputing sparse patient EHRs. As its name suggests, the idea is to transfer knowledge from patients with denser EHRs to similar patients with sparse EHRs. We formulate Density Transfer as an optimization problem and propose an efficient approach based on block coordinate descent to solve it.
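The abstract does not state the objective function, so the following is only a minimal sketch of what block coordinate descent looks like on a generic imputation problem, not the Density Transfer formulation itself. It assumes a rank-1 model in which each observed entry X[i][j] is approximated by u[i] * v[j], with `None` marking missing entries; the function name `impute_rank1` and all parameters are hypothetical. Each pass fixes one block of variables and solves for the other in closed form by least squares over the observed entries.

```python
def impute_rank1(X, n_iters=100):
    """Fill missing entries (None) of X with a rank-1 fit u[i] * v[j].

    Illustrative block coordinate descent; this is an assumed toy model,
    not the objective used in the paper.
    """
    n, m = len(X), len(X[0])
    u = [1.0] * n  # one factor per patient (row)
    v = [1.0] * m  # one factor per feature (column)

    for _ in range(n_iters):
        # Block 1: update every u[i] with v held fixed (closed-form least squares).
        for i in range(n):
            obs = [j for j in range(m) if X[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(X[i][j] * v[j] for j in obs) / den

        # Block 2: update every v[j] with u held fixed.
        for j in range(m):
            obs = [i for i in range(n) if X[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(X[i][j] * u[i] for i in obs) / den

    # Keep observed values as they are; fill only the missing ones.
    return [
        [X[i][j] if X[i][j] is not None else u[i] * v[j] for j in range(m)]
        for i in range(n)
    ]
```

Alternating between the two blocks never increases the squared error on the observed entries, which is the property that makes this style of solver attractive for large, sparse record matrices. A realistic model would use a higher rank and additional regularization.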
