A spline‐based semiparametric sieve likelihood method for over‐dispersed panel count data

In this article we study a gamma-frailty inhomogeneous Poisson process model for analysing over-dispersed panel count data. A cubic B-spline function is used to approximate the logarithm of the baseline mean function in the semiparametric proportional mean model. The regression parameters and spline coefficients are estimated jointly by maximizing a spline-based sieve pseudo-likelihood in which the nuisance over-dispersion parameter is replaced by its moment estimate. The asymptotic properties of the proposed maximum pseudo-likelihood estimator, including its consistency, its convergence rate and the asymptotic normality of the estimated regression parameters, are studied using modern empirical process theory. A spline-based least-squares standard error estimator is developed to facilitate robust inference for the regression parameters. Simulation studies are conducted to investigate the finite-sample performance of the proposed method and the robustness of the gamma-frailty inhomogeneous Poisson process model. Finally, for illustration, the method is used to analyse data from an observational study of sexually transmitted infection (STI) in young women.

The Canadian Journal of Statistics 42: 217–245; 2014 © 2014 Statistical Society of Canada
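
The ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a working likelihood that treats each observed cumulative count as marginally negative binomial (the gamma-Poisson mixture with frailty mean 1 and variance `sigma2`), a single scalar covariate, and a fixed over-dispersion value rather than a moment estimate. The knot vector, coefficients and toy panel are hypothetical, and the function names are chosen here for illustration only.

```python
import math

def bspline_basis(j, degree, t, knots):
    """Value at t of the j-th B-spline basis function (Cox-de Boor recursion)."""
    if degree == 0:
        return 1.0 if knots[j] <= t < knots[j + 1] else 0.0
    left = right = 0.0
    span_l = knots[j + degree] - knots[j]
    if span_l > 0:
        left = (t - knots[j]) / span_l * bspline_basis(j, degree - 1, t, knots)
    span_r = knots[j + degree + 1] - knots[j + 1]
    if span_r > 0:
        right = ((knots[j + degree + 1] - t) / span_r
                 * bspline_basis(j + 1, degree - 1, t, knots))
    return left + right

def mean_count(t, z, alpha, beta, knots):
    """Proportional mean model: exp(cubic spline for the log baseline + beta * z)."""
    log_baseline = sum(a * bspline_basis(j, 3, t, knots)
                       for j, a in enumerate(alpha))
    return math.exp(log_baseline + beta * z)

def nb_logpmf(n, mu, sigma2):
    """log P(N = n) when N | g ~ Poisson(g * mu) and g ~ Gamma(mean 1, var sigma2)."""
    r = 1.0 / sigma2  # gamma shape parameter
    return (math.lgamma(n + r) - math.lgamma(r) - math.lgamma(n + 1)
            + r * math.log(r / (r + mu)) + n * math.log(mu / (r + mu)))

def pseudo_loglik(panel, alpha, beta, sigma2, knots):
    """Sum of marginal log-probabilities over (time, covariate, count) triples.

    Assumption: counts are treated as if independent, as in a working
    pseudo-likelihood; the article's exact criterion may differ.
    """
    return sum(nb_logpmf(n, mean_count(t, z, alpha, beta, knots), sigma2)
               for t, z, n in panel)

# Hypothetical toy inputs: clamped cubic knots on [0, 3) give 6 basis functions.
KNOTS = [0, 0, 0, 0, 1, 2, 3, 3, 3, 3]
ALPHA = [-1.0, -0.5, 0.0, 0.4, 0.7, 0.9]
PANEL = [(0.5, 0, 0), (1.5, 0, 2), (2.5, 0, 3), (0.8, 1, 1), (2.2, 1, 6)]

print(pseudo_loglik(PANEL, ALPHA, beta=0.5, sigma2=0.8, knots=KNOTS))
```

Under this mixture the marginal variance is `mu + sigma2 * mu**2`, which is the over-dispersion relative to a plain Poisson count. In practice the criterion would be passed to a numerical optimizer over `alpha` and `beta`, with `sigma2` plugged in from a separate estimate.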
