Unplanned 30-Day Readmissions: Hospital Data Warehouse Modelling.

Background: Unplanned hospital readmissions are a major healthcare and economic burden. This study compared statistical methods and machine learning algorithms for predicting the risk of all-cause 30-day hospital readmission in two French academic hospitals.

Methods: The dataset included hospital stays selected from the clinical data warehouses (CDW) of the two hospitals (Rennes and Tours Academic Hospitals) using the criteria of the French national methodology for measuring the 30-day readmission rate (i.e. ≥18-year-old patients, geolocation, no iterative stays, and no hospitalization for palliative care). The prediction performance of Logistic Regression, Naive Bayes, Gradient Boosting, Random Forest, and Neural Networks was then compared separately for the two hospitals, using the same CDW data pre-processing for all algorithms. For each model, the area under the receiver operating characteristic curve (AUC) for 30-day readmission prediction and the time needed to train the algorithm were measured.

Results: In total, 259,092 and 197,815 stays were included from the Rennes and Tours Academic Hospital CDWs, respectively, with readmission rates of 8.8% (Rennes) and 9.5% (Tours). The AUC of the regression models for the two hospitals ranged from 0.61 to 0.64, with computation times exceeding 18 hours. The AUC of the machine learning models ranged from 0.61 to 0.69, with computation times below 13 hours.

Conclusions: Machine learning methods gave better performance and shorter computation times than regression models. Comparing different algorithms remains necessary to identify the most efficient model.
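
As an illustration of the two quantities reported for each model (AUC and training time), the following is a minimal sketch in Python, not the study's actual pipeline. The function names `roc_auc` and `timed_evaluation`, and the convention that a trainer returns a scoring function, are assumptions made for this example. The AUC is computed with the rank-based (Mann-Whitney) formulation, which equals the probability that a randomly chosen readmitted stay receives a higher score than a randomly chosen non-readmitted stay.

```python
import time
from typing import Callable, List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Rank-based AUC; tied scores share an average rank (count as 0.5)."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")

    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical convention: a trainer takes (X, y) and returns a function
# mapping one feature vector to a readmission risk score.
Trainer = Callable[[List[List[float]], List[int]], Callable[[List[float]], float]]


def timed_evaluation(
    train: Trainer,
    X_train: List[List[float]],
    y_train: List[int],
    X_test: List[List[float]],
    y_test: List[int],
) -> Tuple[float, float]:
    """Return (test AUC, training time in seconds) for one algorithm."""
    start = time.perf_counter()
    predict = train(X_train, y_train)
    elapsed = time.perf_counter() - start
    scores = [predict(x) for x in X_test]
    return roc_auc(y_test, scores), elapsed
```

Running `timed_evaluation` once per algorithm and per hospital, on identically pre-processed data, would yield the kind of AUC and training-time pairs compared in the Results. Only the training call is timed here; whether the study's computation times also covered pre-processing or prediction is not stated.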