New models for old questions: generalized linear models for cost prediction.

BACKGROUND Generalized linear models (GLMs) have recently been introduced into cost data analysis. GLMs, which extend the linear regression model, are characterized by a response distribution drawn from the exponential family of distributions and by a monotonic link function that relates the response mean to a scale on which additive model effects operate.

OBJECTIVES This study compared GLMs and ordinary least squares regression (OLS) in predicting individual patient costs in adult intensive care units (ICUs) and sought to define the utility of the inverse Gaussian distribution family within GLMs.

METHODS A prospective 'ground-up' utilization costing study was performed in three adult university-associated ICUs, enrolling consecutive ICU admissions over a 6-month period in 1991. ICU utilization, patient demographic and ICU admission day data were recorded by dedicated data collectors. Model performance was assessed by prediction error [mean absolute error (MAE), root mean squared error (RMSE)] and residual analysis.

RESULTS The cohort, 1098 patients who survived ICU, had a mean (SD) age of 56 (19.5) years and was 41% female. Mean (SD) patient cost per ICU episode (1991 A$) was A$6311 (9689), with a range of A$106 to A$95602. Prediction error for mean costs was lowest with OLS using heteroscedastic retransformation of log costs (MAE 4780; RMSE 8965) and with a GLM with Gaussian family and log link (MAE 4798; RMSE 8907). Residual analysis suggested optimal overall performance for these two models and for a GLM with inverse Gaussian family and log link.

CONCLUSIONS Traditional cost models of OLS with (log) cost transformation may be supplemented by appropriately specified GLMs, which more closely model the error structure.
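The prediction-error measures and the log-cost retransformation described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a single hypothetical predictor (`severity`), invented toy costs, and the simpler homoscedastic smearing retransformation in place of the heteroscedastic version the study reports. All function and variable names are illustrative.

```python
import math


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def fit_simple_ols(x, y):
    """Closed-form least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_cost_log_ols(x, cost):
    """Fit OLS to log(cost), then retransform to the cost scale.

    Assumption: a single homoscedastic smearing factor (the mean of the
    exponentiated log-scale residuals) is used, not the heteroscedastic
    retransformation reported in the abstract.
    """
    log_cost = [math.log(c) for c in cost]
    intercept, slope = fit_simple_ols(x, log_cost)
    fitted_log = [intercept + slope * xi for xi in x]
    residuals = [lc - f for lc, f in zip(log_cost, fitted_log)]
    smear = sum(math.exp(r) for r in residuals) / len(residuals)
    predicted = [math.exp(f) * smear for f in fitted_log]
    return predicted, smear


# Hypothetical toy data, invented for illustration only (not study data).
severity = [1, 2, 3, 4, 5, 6]
cost = [400.0, 900.0, 1500.0, 5200.0, 6100.0, 30000.0]

predicted, smear = predict_cost_log_ols(severity, cost)
print("smearing factor:", round(smear, 3))
print("MAE:", round(mae(cost, predicted), 1))
print("RMSE:", round(rmse(cost, predicted), 1))
```

Exponentiating the fitted log costs alone would underestimate mean cost, which is why a retransformation factor is applied. A GLM with a log link models the mean cost directly and so avoids this step.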
