Comparison of the Mortality Probability Admission Model III, National Quality Forum, and Acute Physiology and Chronic Health Evaluation IV Hospital Mortality Models: Implications for National Benchmarking*

Objective: To examine the accuracy of the original Mortality Probability Admission Model III (MPM0-III), the ICU Outcomes Model/National Quality Forum (IOM/NQF) modification of MPM0-III, and the Acute Physiology and Chronic Health Evaluation (APACHE) IVa model for comparing observed and risk-adjusted hospital mortality predictions.

Design: Retrospective paired analyses of day 1 hospital mortality predictions using three prognostic models.

Setting: Fifty-five ICUs at 38 U.S. hospitals from January 2008 to December 2012.

Patients: Among 174,001 intensive care admissions, 109,926 met model inclusion criteria and 55,304 had data for mortality prediction using all three models.

Interventions: None.

Measurements and Main Results: We compared patient exclusions and the discrimination, calibration, and accuracy of each model. APACHE IVa excluded 10.7% of all patients, IOM/NQF 20.1%, and MPM0-III 24.1%. Discrimination of APACHE IVa was superior, with an area under the receiver operating characteristic curve of 0.88, compared with 0.81 for MPM0-III and 0.80 for IOM/NQF. APACHE IVa was also the best calibrated (lowest Hosmer-Lemeshow statistic). The accuracy of APACHE IVa (adjusted Brier score = 31.0%, where higher is better) was superior to that of MPM0-III (16.1%) and IOM/NQF (17.8%). Compared with observed mortality, APACHE IVa overpredicted mortality by 1.5% and MPM0-III by 3.1%, whereas IOM/NQF underpredicted mortality by 1.2%. Calibration curves showed that APACHE IVa performed well over the entire risk range, unlike the MPM0-III and IOM/NQF models. APACHE IVa also had better accuracy within patient subgroups and for specific admission diagnoses.

Conclusions: APACHE IVa offered the best discrimination and calibration on a large common dataset and excluded fewer patients than MPM0-III or IOM/NQF. The choice of ICU performance benchmarks should be based on a comparison of model accuracy using data for identical patients.
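The three model properties compared above map to standard statistics: discrimination is the area under the receiver operating characteristic curve, calibration is the Hosmer-Lemeshow statistic over risk deciles, and accuracy is an adjusted (scaled) Brier score. A minimal, self-contained Python sketch on synthetic data, assuming the adjusted Brier score is the usual scaled form 1 − Brier/Brier_reference (higher is better) and using the Hosmer-Lemeshow C statistic; all data and names here are illustrative, not the study's cohort:

```python
import random

random.seed(0)
# Toy cohort: predicted death probabilities and observed outcomes (1 = died),
# generated so the predictions are well calibrated by construction.
preds = [random.random() for _ in range(2000)]
obs = [1 if random.random() < p else 0 for p in preds]

def auroc(p, y):
    """Discrimination: probability a random death outranks a random survivor."""
    pos = [pi for pi, yi in zip(p, y) if yi == 1]
    neg = [pi for pi, yi in zip(p, y) if yi == 0]
    wins = sum((pi > ni) + 0.5 * (pi == ni) for pi in pos for ni in neg)
    return wins / (len(pos) * len(neg))

def scaled_brier(p, y):
    """Accuracy: 1 - Brier/Brier_ref, where Brier_ref uses the mean mortality."""
    brier = sum((pi - yi) ** 2 for pi, yi in zip(p, y)) / len(y)
    mean_y = sum(y) / len(y)
    brier_ref = sum((mean_y - yi) ** 2 for yi in y) / len(y)
    return 1.0 - brier / brier_ref

def hosmer_lemeshow(p, y, bins=10):
    """Calibration: sum over risk deciles of (O - E)^2 / (E * (1 - E/n))."""
    pairs = sorted(zip(p, y))
    size = len(pairs) // bins
    stat = 0.0
    for b in range(bins):
        chunk = pairs[b * size:(b + 1) * size] if b < bins - 1 else pairs[b * size:]
        n = len(chunk)
        observed = sum(yi for _, yi in chunk)   # O: deaths seen in the decile
        expected = sum(pi for pi, _ in chunk)   # E: deaths predicted in the decile
        stat += (observed - expected) ** 2 / (expected * (1.0 - expected / n))
    return stat

print(f"AUROC         {auroc(preds, obs):.3f}")
print(f"scaled Brier  {scaled_brier(preds, obs):.3f}")
print(f"H-L statistic {hosmer_lemeshow(preds, obs):.1f}")
```

Because the toy predictions are calibrated by construction, the Hosmer-Lemeshow statistic stays small; a model that is miscalibrated in part of the risk range, as the calibration curves showed for MPM0-III and IOM/NQF, inflates it even when the AUROC is unchanged.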
