An ensemble framework with $l_{21}$-norm regularized hypergraph Laplacian multi-label learning for clinical data prediction

Previous work has shown that machine learning algorithms can support clinical decision-making and are a valuable tool for physicians. Clinical data often require assigning multiple labels to a single patient record, chosen from a large set of candidate labels. A key problem in learning from multi-labelled data is how to exploit the information contained in the correlations between labels. Hypergraph-based multi-label learning addresses this by exploiting the spectral properties of a hypergraph that encodes the correlation structure of the labels. However, the resulting models are difficult to interpret, mainly because the method cannot identify which features in the original feature space are important. Moreover, it is hard to capture the complex structure of the correlations between labels comprehensively. To overcome these difficulties and improve interpretability, we propose an $l_{21}$-norm regularized hypergraph Laplacian multi-label learning method that performs feature selection and label embedding simultaneously. In-depth experimental studies on the publicly available Medical Information Mart for Intensive Care (MIMIC-III) database validate the effectiveness of our approach.
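
To illustrate why an $l_{21}$-norm penalty supports feature selection, the following minimal sketch computes the norm of a weight matrix and ranks features by row norm. It assumes the common convention that rows index features and columns index labels, so the $l_{21}$-norm is the sum of the Euclidean norms of the rows. The matrix `W` and the helper names `l21_norm` and `rank_features` are hypothetical and are not taken from the paper.

```python
import math


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (a list of lists)."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


def rank_features(W):
    """Return feature (row) indices sorted by decreasing row norm."""
    norms = [math.sqrt(sum(w * w for w in row)) for row in W]
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)


# Hypothetical weight matrix: 3 features (rows) x 2 labels (columns).
# The second row is all zeros, i.e. that feature is unused for every label.
W = [[3.0, 4.0],
     [0.0, 0.0],
     [1.0, 0.0]]

print(l21_norm(W))       # 5 + 0 + 1 = 6.0
print(rank_features(W))  # [0, 2, 1]
```

Because the penalty sums row norms rather than squared entries, minimizing it tends to drive entire rows to zero, which removes a feature for all labels at once. The surviving rows can then be read as the selected features.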
