Hazard function estimation using B-splines.

A flexible parametric procedure is given to model the hazard function as a linear combination of cubic B-splines and to obtain maximum likelihood estimates from censored survival data. The approach yields smooth estimates of the hazard and survivorship functions that are intermediate in structure between strongly parametric and non-parametric models. A simple method is described for selecting the number and location of knots. Simulation results show favorable root mean square error relative to non-parametric estimates for both the hazard and survivorship functions. Three methods are given for calculating confidence intervals, based respectively on the delta method, profile likelihood, and the bootstrap. The procedure is applied to estimate hazard rates for acquired immunodeficiency syndrome (AIDS) following infection with human immunodeficiency virus (HIV). Spline methods can accommodate complex censoring mechanisms such as those that arise in the AIDS setting. To illustrate, HIV infection incidence is estimated for a cohort of hemophiliacs in which the dates of HIV infection are interval-censored and some subjects were born after the onset of the HIV epidemic.
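The core idea, a hazard written as a linear combination of cubic B-splines and fitted by maximum likelihood, can be illustrated with a minimal Python sketch. It is not the procedure of the paper: it handles right censoring only (no interval censoring or truncation), keeps the hazard nonnegative by bounding the coefficients below, places the interior knots at quantiles of the observed event times, and runs on simulated data. The function names `make_basis`, `design`, and `fit_hazard` are hypothetical. With hazard h(t) = sum_j theta_j B_j(t) and cumulative hazard H(t), the right-censored log-likelihood is sum_i [d_i log h(t_i) - H(t_i)], where d_i is the event indicator.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import minimize


def make_basis(interior_knots, t_max, degree=3):
    """Cubic B-spline basis functions on [0, t_max] and their antiderivatives."""
    # Clamped knot vector: each boundary knot is repeated degree + 1 times.
    knots = np.r_[[0.0] * (degree + 1), np.sort(interior_knots),
                  [t_max] * (degree + 1)]
    n_basis = len(knots) - degree - 1
    basis = [BSpline(knots, np.eye(n_basis)[j], degree) for j in range(n_basis)]
    integrals = [b.antiderivative() for b in basis]
    return basis, integrals


def design(funcs, x):
    """Evaluate each function at x (rows: points, columns: functions)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.column_stack([f(x) for f in funcs])


def fit_hazard(time, event, interior_knots, t_max):
    """Maximum likelihood fit of h(t) = sum_j theta_j * B_j(t), right censoring only."""
    basis, integrals = make_basis(interior_knots, t_max)
    X = design(basis, time)                                    # B_j(t_i)
    Z = design(integrals, time) - design(integrals, 0.0)       # int_0^{t_i} B_j

    def neg_loglik(theta):
        h = X @ theta
        value = -(event @ np.log(h)) + np.sum(Z @ theta)
        grad = -(X.T @ (event / h)) + Z.sum(axis=0)
        return value, grad

    # Start from the constant-hazard (exponential) estimate.
    theta0 = np.full(X.shape[1], event.sum() / time.sum())
    # Assumption of this sketch: theta_j >= 1e-8 keeps the hazard positive.
    res = minimize(neg_loglik, theta0, jac=True, method="L-BFGS-B",
                   bounds=[(1e-8, None)] * X.shape[1])
    theta = res.x

    def hazard(t):
        return design(basis, t) @ theta

    def cum_hazard(t):
        return (design(integrals, t) - design(integrals, 0.0)) @ theta

    return hazard, cum_hazard, theta


# Simulated example: exponential event times (true hazard 0.5), uniform censoring.
rng = np.random.default_rng(0)
n = 500
t_event = rng.exponential(scale=2.0, size=n)
t_cens = rng.uniform(0.0, 6.0, size=n)
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(float)

# Two interior knots at tertiles of the observed event times (an assumption).
interior = np.quantile(time[event == 1], [1 / 3, 2 / 3])
hazard, cum_hazard, theta = fit_hazard(time, event, interior, t_max=time.max())

grid = np.linspace(0.0, time.max(), 50)
survival = np.exp(-cum_hazard(grid))   # survivorship S(t) = exp(-H(t))
```

Because the log-likelihood is concave in the coefficients, a bounded quasi-Newton optimizer is adequate here. The nonnegativity bound is a convenient sufficient condition for a nonnegative hazard, not necessarily the constraint used in the paper.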
