Learning Tasks for Multitask Learning: Heterogenous Patient Populations in the ICU

Machine learning approaches have been effective in predicting adverse outcomes in different clinical settings. These models are often developed and evaluated on datasets with heterogeneous patient populations. However, good predictive performance on the aggregate population does not imply good performance for specific groups. In this work, we present a two-step framework to 1) learn relevant patient subgroups, and 2) predict an outcome for separate patient populations in a multi-task framework, where each population is a separate task. We demonstrate how to discover relevant groups in an unsupervised way with a sequence-to-sequence autoencoder. We show that using these groups in a multi-task framework leads to better predictive performance of in-hospital mortality both across groups and overall. We also highlight the need for more granular evaluation of performance when dealing with heterogeneous populations.
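The point about granular evaluation can be illustrated with a minimal sketch. It is not the authors' code: the group labels `A` and `B`, the toy scores, and the helper names `auc`, `per_group_auc`, and `macro_auc` are hypothetical and chosen only for illustration. It computes the AUC on the aggregate population, then separately for each patient group, and finally the unweighted (macro) average across groups.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC by pairwise comparison; ties count 0.5. Returns None if a class is absent."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def per_group_auc(groups, labels, scores):
    """Compute the AUC separately within each group."""
    by_group = defaultdict(lambda: ([], []))
    for g, y, s in zip(groups, labels, scores):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in by_group.items()}


def macro_auc(group_aucs):
    """Unweighted mean of the per-group AUCs, skipping undefined ones."""
    vals = [v for v in group_aucs.values() if v is not None]
    return sum(vals) / len(vals) if vals else None


# Toy data (made up): the model ranks group A perfectly and group B backwards.
groups = ["A"] * 4 + ["B"] * 4
labels = [1, 1, 0, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.2, 0.1, 0.4, 0.3, 0.6, 0.5]

aggregate = auc(labels, scores)
by_group = per_group_auc(groups, labels, scores)
macro = macro_auc(by_group)
print(aggregate, by_group, macro)
```

On this toy data the aggregate AUC is 0.75, while the per-group values are 1.0 for `A` and 0.0 for `B`, so the aggregate figure hides a complete failure on one group. The macro average (0.5) weights each group equally regardless of its size.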
