Temporal Convolutional Neural Networks for Diagnosis from Lab Tests

Early diagnosis of treatable diseases is essential for improving healthcare, and the onset of many diseases is predictable from annual lab tests and their temporal trends. We introduce a multi-resolution convolutional neural network for early detection of multiple diseases from irregularly measured, sparse lab values. Our novel architecture takes as input both an imputed version of the data and a binary observation matrix. To impute the temporally sparse observations, we develop a flexible, fast-to-train method for differentiable multivariate kernel regression. Our experiments on data from 298K individuals over 8 years, 18 common lab measurements, and 171 diseases show that the temporal signatures learned via convolution are significantly more predictive than baselines commonly used for early disease diagnosis.
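
To make the two network inputs concrete, the following is a minimal sketch, not the method developed in the paper. It assumes a single lab test, a Gaussian kernel with a fixed bandwidth, and a classical Nadaraya-Watson estimator, whereas the abstract describes a differentiable multivariate kernel regression. The function names, the monthly grid, and the sample values are hypothetical.

```python
import math


def nadaraya_watson_impute(times, values, grid, bandwidth=1.0):
    """Estimate a value at each grid time as a kernel-weighted mean of the observations.

    Assumption: Gaussian kernel with a fixed bandwidth (not learned, univariate).
    """
    imputed = []
    for t in grid:
        # Weight each observation by its temporal distance from the grid point.
        weights = [math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in times]
        total = sum(weights)
        if total == 0.0:
            # Every observation is too far away for the kernel to register.
            imputed.append(float("nan"))
        else:
            imputed.append(sum(w * v for w, v in zip(weights, values)) / total)
    return imputed


def observation_mask(times, grid):
    """Binary indicator: 1 where a measurement was actually taken, 0 elsewhere."""
    observed = set(times)
    return [1 if t in observed else 0 for t in grid]


# Hypothetical data: one lab test measured at irregular months.
obs_times = [0, 3, 4, 9]
obs_values = [5.1, 5.4, 5.6, 6.3]
grid = list(range(12))

imputed = nadaraya_watson_impute(obs_times, obs_values, grid, bandwidth=1.5)
mask = observation_mask(obs_times, grid)

# The two rows together stand in for the imputed data and the observation matrix.
network_input = [imputed, mask]
```

Passing the mask alongside the imputed series lets a downstream model distinguish measured values from interpolated ones, which is the role the abstract assigns to the binary observation matrix.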
