MD-MTL: An Ensemble Med-Multi-Task Learning Package for Disease Scores Prediction and Multi-Level Risk Factor Analysis

While many machine learning methods have been used for medical prediction and risk factor analysis on healthcare data, most prior research has relied on single-task learning (STL) methods. However, healthcare research often involves multiple related tasks: for instance, predicting disease scores and analyzing risk factors in multiple patient subgroups simultaneously, or analyzing risk factors at multiple levels at once. In this paper, we develop a new ensemble machine learning Python package based on multi-task learning (MTL), referred to as the Med-Multi-Task Learning (MD-MTL) package, and apply it to predicting patients' disease scores and to carrying out risk factor analysis on multiple patient subgroups simultaneously. Our experimental results on two datasets demonstrate the utility of the MD-MTL package and show the advantage of MTL over STL when analyzing data that is organized into different categories (tasks, such as age groups or levels of disease severity).
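To make the STL versus MTL distinction concrete, the following minimal sketch treats each patient subgroup as one regression task. It is not the MD-MTL package's API: the function names `fit_stl` and `fit_mean_regularized_mtl`, the synthetic data, and the choice of a mean-regularized formulation (each task's weights are pulled toward a shared average) are all assumptions made for illustration, and the package itself may use different MTL formulations.

```python
import numpy as np


def fit_stl(tasks, lam=1.0):
    """STL baseline (hypothetical helper): one independent ridge regression per task."""
    weights = []
    for X, y in tasks:
        d = X.shape[1]
        weights.append(np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y))
    return np.stack(weights)  # shape: (n_tasks, n_features)


def fit_mean_regularized_mtl(tasks, lam=1.0, n_iter=50):
    """Mean-regularized MTL (hypothetical helper, illustrative formulation).

    Alternately minimizes
        sum_t ||X_t w_t - y_t||^2 + lam * sum_t ||w_t - w_bar||^2
    over the per-task weights w_t and the shared mean w_bar.
    """
    d = tasks[0][0].shape[1]
    w_bar = np.zeros(d)
    for _ in range(n_iter):
        # With w_bar fixed, each task has a closed-form ridge-like solution.
        W = np.stack([
            np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * w_bar)
            for X, y in tasks
        ])
        # With the task weights fixed, the optimal w_bar is their average.
        w_bar = W.mean(axis=0)
    return W, w_bar


if __name__ == "__main__":
    # Synthetic stand-in for patient subgroups: related tasks, few samples each.
    rng = np.random.default_rng(0)
    n_tasks, n_samples, n_features = 4, 15, 8
    shared = rng.normal(size=n_features)
    tasks = []
    for _ in range(n_tasks):
        w_true = shared + 0.1 * rng.normal(size=n_features)
        X = rng.normal(size=(n_samples, n_features))
        y = X @ w_true + 0.1 * rng.normal(size=n_samples)
        tasks.append((X, y))

    W_stl = fit_stl(tasks)
    W_mtl, w_bar = fit_mean_regularized_mtl(tasks)
    print("STL weights shape:", W_stl.shape)
    print("MTL weights shape:", W_mtl.shape)
```

In this sketch the per-row magnitudes of the learned weights play the role of a per-subgroup risk factor ranking, while `w_bar` summarizes what the subgroups share. Larger values of `lam` tie the tasks together more tightly.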
