Development and Validation of a Deep Neural Network Model for Prediction of Postoperative In-hospital Mortality

What We Already Know about This Topic
Robust predictions are required to compare perioperative mortality among hospitals. Deep neural network systems, a type of machine learning, can be used to develop highly nonlinear prediction models.

What This Article Tells Us That Is New
The authors’ neural network model was comparable in accuracy to, but potentially more efficient at feature selection than, logistic regression models. Deep neural network–based machine learning provides an alternative to conventional multivariate regression.

Background: The authors tested the hypothesis that deep neural networks trained on intraoperative features can predict postoperative in-hospital mortality.

Methods: The data used to train and validate the algorithm consist of 59,985 patients with 87 features extracted at the end of surgery. Feed-forward networks with a logistic output were trained using stochastic gradient descent with momentum. The deep neural networks were trained on 80% of the data, with 20% reserved for testing. The authors assessed whether adding the American Society of Anesthesiologists (ASA) Physical Status Classification improved the deep neural network, and how robust the network was to a reduced feature set. The networks were then compared to ASA Physical Status, logistic regression, and other published clinical scores including the Surgical Apgar, Preoperative Score to Predict Postoperative Mortality, Risk Quantification Index, and the Risk Stratification Index.

Results: In-hospital mortality was 0.81% in the training set and 0.73% in the test set. Among the deep neural networks, the one with a reduced feature set and ASA Physical Status classification had the highest area under the receiver operating characteristics curve, 0.91 (95% CI, 0.88 to 0.93). The highest logistic regression area under the curve was found with a reduced feature set and ASA Physical Status (0.90; 95% CI, 0.87 to 0.93). Across all methods compared, the Risk Stratification Index had the highest area under the receiver operating characteristics curve, at 0.97 (95% CI, 0.94 to 0.99).

Conclusions: Deep neural networks can predict in-hospital mortality based on automatically extractable intraoperative data, but are not (yet) superior to existing methods.
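The pipeline described in the Methods (a feed-forward network with a logistic output, stochastic gradient descent with momentum, an 80%/20% split, and an area under the curve with a 95% CI) can be illustrated with a minimal sketch. This is not the authors' code. The synthetic data, the single ReLU hidden layer, every hyperparameter, the percentile bootstrap for the interval, and all function names below are assumptions made for illustration only.

```python
import numpy as np

def make_synthetic(n=4000, d=10, seed=0):
    """Hypothetical stand-in for the clinical data (not the study's features)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    logits = X @ rng.normal(size=d) + 0.5 * X[:, 0] * X[:, 1] - 3.0
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(float)
    return X, y  # the negative offset makes positives the minority class

def forward(params, X):
    h = np.maximum(0.0, X @ params["W1"] + params["b1"])   # ReLU hidden layer
    z = np.clip(h @ params["W2"] + params["b2"], -30, 30)  # avoid exp overflow
    return h, (1.0 / (1.0 + np.exp(-z))).ravel()           # logistic output

def train(X, y, hidden=16, lr=0.01, momentum=0.9, epochs=30, batch=64, seed=0):
    """Mini-batch SGD with classical momentum on mean cross-entropy loss."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    params = {"W1": rng.normal(scale=np.sqrt(2.0 / d), size=(d, hidden)),
              "b1": np.zeros(hidden),
              "W2": rng.normal(scale=np.sqrt(1.0 / hidden), size=(hidden, 1)),
              "b2": np.zeros(1)}
    velocity = {k: np.zeros_like(v) for k, v in params.items()}
    for _ in range(epochs):
        order = rng.permutation(len(X))
        for start in range(0, len(X), batch):
            idx = order[start:start + batch]
            xb, yb = X[idx], y[idx]
            h, p = forward(params, xb)
            dz2 = (p - yb)[:, None] / len(idx)      # d(loss)/d(output logit)
            dz1 = (dz2 @ params["W2"].T) * (h > 0)  # backprop through ReLU
            grads = {"W1": xb.T @ dz1, "b1": dz1.sum(axis=0),
                     "W2": h.T @ dz2, "b2": dz2.sum(axis=0)}
            for k in params:
                velocity[k] = momentum * velocity[k] - lr * grads[k]
                params[k] += velocity[k]
    return params

def auc(y_true, scores):
    """AUC via the rank-sum identity; tied scores receive average ranks."""
    _, inv, counts = np.unique(scores, return_inverse=True, return_counts=True)
    ranks = (np.cumsum(counts) - (counts - 1) / 2.0)[inv]
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def bootstrap_ci(y_true, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed CI method)."""
    rng = np.random.default_rng(seed)
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, len(y_true), len(y_true))
        if 0 < y_true[idx].sum() < len(idx):  # resample must hold both classes
            stats.append(auc(y_true[idx], scores[idx]))
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

X, y = make_synthetic()
order = np.random.default_rng(1).permutation(len(X))
cut = int(0.8 * len(X))                       # 80% train, 20% held-out test
train_idx, test_idx = order[:cut], order[cut:]
mean, std = X[train_idx].mean(axis=0), X[train_idx].std(axis=0)
X_train, X_test = (X[train_idx] - mean) / std, (X[test_idx] - mean) / std

params = train(X_train, y[train_idx])
_, p_test = forward(params, X_test)
test_auc = auc(y[test_idx], p_test)
ci_low, ci_high = bootstrap_ci(y[test_idx], p_test)
print(f"test AUC {test_auc:.3f} (95% CI, {ci_low:.3f} to {ci_high:.3f})")
```

Standardization statistics are computed on the training portion only, so no information from the held-out 20% leaks into training. With an outcome as rare as the one reported here (under 1%), a real analysis would also need enough test-set events for the interval to be meaningful.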
