Semiparametric Estimation of Longitudinal Medical Cost Trajectory

ABSTRACT Estimating the average monthly medical costs from disease diagnosis to a terminal event such as death for an incident cohort of patients is a topic of immense interest to researchers in health policy and health economics because patterns of average monthly costs over time reveal how medical costs vary across phases of care. The statistical challenges to estimating monthly medical costs longitudinally are multifold; the longitudinal cost trajectory (formed by plotting the average monthly costs from diagnosis to the terminal event) is likely to be nonlinear, with its shape depending on the time of the terminal event, which can be subject to right censoring. The goal of this article is to tackle this statistically challenging topic by estimating the conditional mean cost at any month t given the time of the terminal event s. The longitudinal cost trajectories with different terminal event times form a bivariate surface of t and s, under the constraint t ⩽ s. We propose to estimate this surface using bivariate penalized splines in an expectation-maximization algorithm that treats the censored terminal event times as missing data. We evaluate the proposed model and estimation method in simulations and apply the method to the medical cost data of an incident cohort of stage IV breast cancer patients from the Surveillance, Epidemiology, and End Results–Medicare Linked Database. Supplementary materials for this article are available online.
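
The estimation target described above can be written compactly under assumed notation that is not taken from the article: $Y_i(t)$ is the cost of patient $i$ in month $t$, $S_i$ the terminal event time, $B_k$ a bivariate spline basis function, $P$ a penalty matrix, $\lambda$ a smoothing parameter, and $\ell_i$ the complete-data log-likelihood contribution of patient $i$. The squared-error criterion is a generic stand-in; the article's actual likelihood, basis, and penalty may differ.

```latex
% Sketch only: all symbols are illustrative assumptions, not the article's notation.
% Conditional mean cost surface, defined on the triangle t <= s,
% and a generic penalized-spline fit for patients with observed S_i.
\begin{align*}
\mu(t, s) &= E\{\, Y(t) \mid S = s \,\}, \qquad 0 \le t \le s, \\
\mu(t, s) &\approx \sum_{k=1}^{K} \beta_k B_k(t, s), \\
\hat{\beta} &= \arg\min_{\beta} \sum_{i} \sum_{t \le S_i}
  \{\, Y_i(t) - \mu(t, S_i) \,\}^2 + \lambda\, \beta^{\top} P \beta .
\end{align*}
% E-step when some S_i are right censored: the unobserved S_i is treated as
% missing and averaged over its conditional distribution at the current iterate.
\[
Q(\beta \mid \beta^{(m)}) = \sum_{i}
  E\{\, \ell_i(\beta;\, Y_i, S_i) \mid \text{observed data},\ \beta^{(m)} \,\}
\]
```

For a patient whose terminal event is observed, the expectation is trivial because $S_i$ is known; only censored patients contribute an average over possible event times later than their censoring time.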
