Learning Hidden Patterns from Patient Multivariate Time Series Data Using Convolutional Neural Networks: A Case Study of Healthcare Cost Prediction

OBJECTIVE To develop an effective and scalable method for individual-level patient cost prediction that automatically learns hidden temporal patterns from the multivariate time series data in patient insurance claims, using a convolutional neural network (CNN) architecture.
METHODS We used three years of medical and pharmacy claims data (2013 to 2016) from a healthcare insurer; data from the first two years were used to build the model that predicts costs in the third year. The data consisted of multivariate time series of cost, visit and medical features, shaped as images of patients' health status (i.e., matrices with time windows on one dimension and the medical, visit and cost features on the other). These images were given to a CNN with a proposed architecture. After hyper-parameter tuning, the architecture consisted of three building blocks of convolution and pooling layers, with a leaky rectified linear unit (LReLU) activation function and a kernel size customized at each layer for healthcare data. The temporal patterns learned by the CNN became inputs to a fully connected layer. We benchmarked the proposed method against three other methods: 1) a spike temporal pattern detection method, the most accurate method for healthcare cost prediction described to date in the literature; 2) a symbolic temporal pattern detection method, the most common approach for leveraging healthcare temporal data; and 3) the most commonly used CNN architectures for image pattern detection (i.e., AlexNet, VGGNet and ResNet), applied via transfer learning. We also assessed the contribution of each type of data (i.e., cost, visit and medical). Finally, we externally validated the proposed method on a separate cohort of patients. All prediction performance was measured in terms of mean absolute percentage error (MAPE).
RESULTS The proposed CNN configuration outperformed the spike temporal pattern detection and symbolic temporal pattern detection methods, with a MAPE of 1.67 versus 2.02 and 3.66, respectively (p<0.01). It also outperformed ResNet, AlexNet and VGGNet, which had MAPEs of 4.59, 4.85 and 5.06, respectively (p<0.01). Removing the medical, visit and cost features resulted in MAPEs of 1.98, 1.91 and 2.04, respectively (p<0.01).
CONCLUSIONS Feature learning through the proposed CNN configuration significantly improved individual-level healthcare cost prediction. The proposed CNN outperformed temporal pattern detection methods that look for a pre-defined set of pattern shapes, because it can extract a variable number of patterns with various shapes. Temporal patterns learned from the medical, visit and cost data each made a significant contribution to prediction performance. Hyper-parameter tuning showed that considering three-month data patterns yielded the highest prediction accuracy. Our results showed that patients' images extracted from multivariate time series data differ from regular images and hence require CNN architectures designed specifically for them. The proposed method for converting patients' multivariate time series data into images and tuning them for convolutional learning could be applied in many other healthcare applications that involve multivariate time series data.
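As an illustration of the two data-handling steps named above, the following minimal Python sketch shapes one patient's claims into a feature-by-time-window matrix and computes MAPE. It is not the authors' implementation. The function names (`build_patient_image`, `mape`), the `(month_index, feature_name, value)` record layout, the summing of values within a window, and the decision to skip zero-cost patients in MAPE are all assumptions made for the example.

```python
def build_patient_image(claims, features, n_windows, window_months=1):
    """Shape one patient's claims into a features x time-windows matrix.

    claims: iterable of (month_index, feature_name, value) records
            (hypothetical layout, assumed for this sketch).
    features: ordered feature names; each becomes one row of the matrix.
    """
    row_of = {name: i for i, name in enumerate(features)}
    image = [[0.0] * n_windows for _ in features]
    for month, name, value in claims:
        col = month // window_months  # time window this month falls into
        if name in row_of and 0 <= col < n_windows:
            # Assumption: values in the same window are summed.
            image[row_of[name]][col] += value
    return image


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a ratio.

    Assumption: records with a zero actual value are skipped, because the
    ratio is undefined for them. Multiply by 100 for a percentage.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


# Usage: six months of claims grouped into two three-month windows.
claims = [(0, "cost", 100.0), (0, "visit", 1), (1, "cost", 50.0),
          (3, "cost", 25.0), (3, "medical", 1)]
image = build_patient_image(claims, ["cost", "visit", "medical"],
                            n_windows=2, window_months=3)
error = mape([100.0, 200.0], [110.0, 180.0])
```

Each row of `image` is one feature's time series and each column is one time window, so a convolution kernel sliding along the columns sees several consecutive windows of all features at once.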

[1]  Ping Zhang,et al.  Risk Prediction with Electronic Health Records: A Deep Learning Approach , 2016, SDM.

[2]  HangSiang Thye,et al.  Bi-linearly weighted fractional max pooling , 2017 .

[3]  Masaki Aono,et al.  Bi-linearly weighted fractional max pooling , 2017, Multimedia Tools and Applications.

[4]  Vijayan K. Asari,et al.  The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches , 2018, ArXiv.

[5]  Yuval Shahar,et al.  Fast time intervals mining using the transitivity of temporal relations , 2013, Knowledge and Information Systems.

[6]  Olivia R. Liu Sheng,et al.  Healthcare cost prediction: Leveraging fine-grain temporal patterns , 2019, J. Biomed. Informatics.

[7]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[8]  M. Rosenthal,et al.  Examining A Health Care Price Transparency Tool: Who Uses It, And How They Shop For Care. , 2016, Health affairs.

[9]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[10]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[11]  Shan Sung Liew,et al.  Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems , 2016, Neurocomputing.

[12]  Yi Zheng,et al.  Time Series Classification Using Multi-Channels Deep Convolutional Neural Networks , 2014, WAIM.

[13]  I. Duncan,et al.  Testing Alternative Regression Frameworks for Predictive Modeling of Health Care Costs , 2016 .

[14]  Yuval Shahar,et al.  Classification of multivariate time series via temporal abstraction and time intervals mining , 2015, Knowledge and Information Systems.

[15]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[16]  Kensaku Kawamoto,et al.  Supervised Learning Methods for Predicting Healthcare Costs: Systematic Literature Review and Empirical Evaluation , 2017, AMIA.

[17]  Michael A. Arbib,et al.  The handbook of brain theory and neural networks , 1995, A Bradford book.

[18]  Andrew Zisserman,et al.  Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.

[19]  Dimitrios Gunopulos,et al.  Mining frequent arrangements of temporal intervals , 2009, Knowledge and Information Systems.

[20]  Elmar Nöth,et al.  Deep Learning Approach to Parkinson’s Disease Detection Using Voice Recordings and Convolutional Neural Network Dedicated to Image Classification , 2019, 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC).

[21]  James F. Allen Maintaining knowledge about temporal intervals , 1983, CACM.

[22]  Eamonn J. Keogh,et al.  A symbolic representation of time series, with implications for streaming algorithms , 2003, DMKD '03.

[23]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[24]  SchmidhuberJürgen Deep learning in neural networks , 2015 .

[25]  Richard A Armstrong,et al.  When to use the Bonferroni correction , 2014, Ophthalmic & physiological optics : the journal of the British College of Ophthalmic Opticians.

[26]  Edward W. Frees,et al.  Actuarial Applications of Multivariate Two-Part Regression Models , 2013, Annals of Actuarial Science.

[27]  Evert de Jonge,et al.  Analysis of ICU Patients Using the Time Series Knowledge Mining Method , 2007 .

[28]  Jochen Gensichen,et al.  Effects of multiple chronic conditions on health care costs: an analysis based on an advanced tree-based regression model , 2013, BMC Health Services Research.

[29]  Chia-Hsuin Chang,et al.  Predicting Healthcare Utilization Using a Pharmacy-based Metric With the WHO’s Anatomic Therapeutic Chemical Algorithm , 2011, Medical care.

[30]  Martine De Cock,et al.  Population Cost Prediction on Public Healthcare Datasets , 2015, Digital Health.

[31]  Milos Hauskrecht,et al.  A Pattern Mining Approach for Classifying Multivariate Temporal Data , 2011, 2011 IEEE International Conference on Bioinformatics and Biomedicine.

[32]  Jürgen Schmidhuber,et al.  Deep learning in neural networks: An overview , 2014, Neural Networks.

[33]  Chih-Ping Wei,et al.  Nearest-neighbor-based approach to time-series classification , 2012, Decis. Support Syst..

[34]  Amy Loutfi,et al.  A review of unsupervised feature learning and deep learning for time-series modeling , 2014, Pattern Recognit. Lett..

[35]  Milos Hauskrecht,et al.  Mining recent temporal patterns for event detection in multivariate time series data , 2012, KDD.

[36]  Ronald M. Summers,et al.  Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning , 2016, IEEE Transactions on Medical Imaging.

[37]  Micah B. Hartman,et al.  National Health Care Spending In 2017: Growth Slows To Post-Great Recession Rates; Share Of GDP Stabilizes. , 2019, Health affairs.

[38]  Milos Hauskrecht,et al.  A temporal pattern mining approach for classifying electronic health record data , 2013, ACM Trans. Intell. Syst. Technol..

[39]  Philippe Burlina,et al.  Comparing humans and deep learning performance for grading AMD: A study in using universal deep features and transfer learning for automated AMD analysis , 2017, Comput. Biol. Medicine.

[40]  Evert de Jonge,et al.  Temporal abstraction for feature extraction: A comparative case study in prediction from intensive care monitoring data , 2007, Artif. Intell. Medicine.

[41]  Daniel S. Kermany,et al.  Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning , 2018, Cell.

[42]  Yuval Shahar,et al.  Medical Temporal-Knowledge Discovery via Temporal Abstraction , 2009, AMIA.

[43]  Ming Yang,et al.  Classification of Alzheimer’s Disease Based on Eight-Layer Convolutional Neural Network with Leaky Rectified Linear Unit and Max Pooling , 2018, Journal of Medical Systems.

[44]  Yuval Shahar,et al.  Consistent discovery of frequent interval-based temporal patterns in chronic patients' data , 2017, J. Biomed. Informatics.

[45]  Yann LeCun,et al.  Comparing SVM and convolutional networks for epileptic seizure prediction from intracranial EEG , 2008, 2008 IEEE Workshop on Machine Learning for Signal Processing.

[46]  Fei Wang,et al.  Towards heterogeneous temporal clinical event pattern discovery: a convolutional approach , 2012, KDD.

[47]  Fabian Mörchen,et al.  Algorithms for time series knowledge mining , 2006, KDD '06.

[48]  Janez Demsar,et al.  Statistical Comparisons of Classifiers over Multiple Data Sets , 2006, J. Mach. Learn. Res..

[49]  Stefan Winkler,et al.  Deep Learning for Emotion Recognition on Small Datasets using Transfer Learning , 2015, ICMI.

[50]  Nigel H. Lovell,et al.  Analyzing health insurance claims on different timescales to predict days in hospital , 2016, J. Biomed. Informatics.

[51]  Ping Zhang,et al.  Integrating Temporal Pattern Mining in Ischemic Stroke Prediction and Treatment Pathway Discovery for Atrial Fibrillation , 2017, CRI.

[52]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[53]  Santosh S. Vempala,et al.  Algorithmic Prediction of Health-Care Costs , 2008, Oper. Res..

[54]  U. Rajendra Acharya,et al.  Automated detection of diabetic subject using pre-trained 2D-CNN models with frequency spectrum images extracted from heart rate signals , 2019, Comput. Biol. Medicine.

[55]  Hayaru Shouno,et al.  Analysis of function of rectified linear unit used in deep learning , 2015, 2015 International Joint Conference on Neural Networks (IJCNN).