Analyzing repeated measures semi-continuous data, with application to an alcohol dependence study

Summary: Two-part random effects models (Olsen and Schafer; Tooze et al.) have been applied to repeated measures of semi-continuous data, which are characterized by a substantial proportion of zero values mixed with a skewed distribution of positive values. In the original formulation of this model, the natural logarithm of the positive values is assumed to follow a normal distribution with constant variance. In this article, we review and consider three extensions of this model that allow the positive values to follow (a) a generalized gamma distribution, (b) a log-skew-normal distribution, or (c) a normal distribution after a Box-Cox transformation. We also allow for heteroscedasticity. Maximum likelihood estimation is shown to be conveniently implemented in SAS Proc NLMIXED. We compare the methods by applying them to daily drinking records in a secondary analysis of data from a randomized controlled trial of topiramate for alcohol dependence treatment. All three models fit significantly better than the log-normal model, and there is strong evidence of heteroscedasticity. We also compare the three models using likelihood ratio tests for non-nested hypotheses (Vuong). The results suggest that the generalized gamma distribution provides the best fit, although the pairwise model comparisons show no statistically significant differences.
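As a rough illustration of the ideas in the summary, the sketch below computes per-observation log-likelihood contributions for a two-part model and a Vuong-type statistic for comparing two non-nested fits. It is a simplified sketch under stated assumptions, not the article's implementation: it omits random effects, covariates, and heteroscedasticity, it uses a plain gamma distribution as a stand-in for the generalized gamma, and all function and parameter names are hypothetical.

```python
import math


def two_part_loglik_lognormal(y, p_pos, mu, sigma):
    """Log-likelihood contributions: point mass at zero plus log-normal positives.

    p_pos is the probability of a positive value; mu and sigma are the mean
    and standard deviation of log(y) for y > 0 (assumed names, no covariates).
    """
    out = []
    for v in y:
        if v == 0:
            out.append(math.log(1.0 - p_pos))
        else:
            z = (math.log(v) - mu) / sigma
            log_density = (-math.log(v) - math.log(sigma)
                           - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z)
            out.append(math.log(p_pos) + log_density)
    return out


def two_part_loglik_gamma(y, p_pos, shape, scale):
    """Same two-part structure, with a gamma density for the positive values."""
    out = []
    for v in y:
        if v == 0:
            out.append(math.log(1.0 - p_pos))
        else:
            log_density = ((shape - 1.0) * math.log(v) - v / scale
                           - math.lgamma(shape) - shape * math.log(scale))
            out.append(math.log(p_pos) + log_density)
    return out


def vuong_statistic(loglik_1, loglik_2):
    """Vuong-type z statistic from pointwise log-likelihood contributions.

    Positive values favor model 1. The contributions must come from
    independent units; with repeated measures that would be one value
    per subject, not one per observation.
    """
    n = len(loglik_1)
    diffs = [a - b for a, b in zip(loglik_1, loglik_2)]
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / n
    if var_diff == 0.0:
        raise ValueError("log-likelihood differences have zero variance")
    return math.sqrt(n) * mean_diff / math.sqrt(var_diff)


if __name__ == "__main__":
    # Made-up values for illustration only; parameters are not fitted.
    y = [0.0, 0.0, 1.5, 3.0, 0.0, 8.0, 2.2, 0.0, 5.1, 1.1]
    ll_lognormal = two_part_loglik_lognormal(y, p_pos=0.6, mu=1.0, sigma=0.8)
    ll_gamma = two_part_loglik_gamma(y, p_pos=0.6, shape=1.8, scale=2.0)
    print(vuong_statistic(ll_gamma, ll_lognormal))
```

In practice the parameters would first be estimated by maximum likelihood for each model, and the statistic would be referred to a standard normal distribution. When the models have different numbers of parameters, a penalty correction is sometimes applied to the log-likelihood difference; that adjustment is left out here.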

[1] S. Merhar, et al. Letter to the editor, 2005, IEEE Communications Magazine.

[2] Marcia M Ward, et al. Effect of critical access hospital conversion on patient safety, 2007, Health services research.

[3] Lei Liu, et al. A multi‐level two‐part random effects model, with application to an alcohol‐dependence study, 2008, Statistics in medicine.

[4] D. Ciraulo, et al. Topiramate for treating alcohol dependence: a randomized controlled trial, 2007, JAMA.

[5] Lei Liu, et al. A likelihood reformulation method in non‐normal random effects models, 2008, Statistics in medicine.

[6] E. Frees, et al. Heavy-tailed longitudinal data modeling using copulas, 2008.

[7] Linda C. Sobell, et al. Timeline Follow-Back: A Technique for Assessing Self-Reported Alcohol Consumption, 1992.

[8] D. Cox, et al. An Analysis of Transformations, 1964.

[9] Jeng-Min Chiou, et al. Quasi-Likelihood Regression with Unknown Link and Variance Functions, 1998.

[10] K. N. Berk, et al. Repeated measures with zeros, 2002.

[11] Bo Bjerre, et al. A Swedish alcohol ignition interlock programme for drink-drivers: effects on hospital care utilization and sick leave, 2007, Addiction.

[12] J. Twisk, et al. Longitudinal tobit regression: a new approach to analyze outcome variables with floor or ceiling effects, 2009, Journal of clinical epidemiology.

[13] Irene A. Stegun, et al. Handbook of Mathematical Functions, 1966.

[14] Anirban Basu, et al. Generalized Modeling Approaches to Risk Adjustment of Skewed Outcomes Data, 2003, Journal of health economics.

[15] H. Chai, et al. Use of log‐skew‐normal distribution in analysis of continuous data with a discrete component at zero, 2008, Statistics in medicine.

[16] R. Rigby, et al. Generalized additive models for location, scale and shape, 2005.

[17] Robert L. Strawderman, et al. Bayesian Inference for a Two-Part Hierarchical Model, 2006.

[18] Joseph L Schafer, et al. A Two-Part Random-Effects Model for Semicontinuous Longitudinal Data, 2001.

[19] Shou-En Lu, et al. Analyzing Excessive No Changes in Clinical Trials with Clustered Data, 2004, Biometrics.

[20] P. Albert. Comment on Lu et al. 2004: Analyzing excessive no changes in clinical trials with clustered data, 2005, Biometrics.

[21] Michael P Epstein, et al. A tobit variance-component method for linkage analysis of censored trait data, 2003, American journal of human genetics.

[22] R. Carroll, et al. A new statistical method for estimating the usual intake of episodically consumed foods with application to their distribution, 2006, Journal of the American Dietetic Association.

[23] S. Raudenbush, et al. Maximum Likelihood for Generalized Linear Models with Nested Random Effects via High-Order, Multivariate Laplace Approximation, 2000.

[24] K. Carey, et al. Temporal stability of the timeline followback interview for alcohol and drug use with psychiatric outpatients, 2004, Journal of studies on alcohol.

[25] L H Moulton, et al. A mixture model with detection limits for regression analyses of antibody response to vaccine, 1995, Biometrics.

[26] A. Basu, et al. Estimating marginal and incremental effects on health outcomes using flexible link and variance function models, 2005, Biostatistics.

[27] D. Stram, et al. Variance components testing in the longitudinal mixed effects model, 1994, Biometrics.

[28] A. Azzalini. A class of distributions which includes the normal ones, 1985.

[29] William A. Knaus, et al. A random effects four-part model, with application to correlated medical costs, 2008, Comput. Stat. Data Anal.

[30] Robert L. Strawderman, et al. Use of the Probability Integral Transformation to Fit Nonlinear Mixed-Effects Models With Nonnormal Random Effects, 2006.

[31] J. Tobin. Estimation of Relationships for Limited Dependent Variables, 1958.

[32] N. Johnston, et al. A probit-log-skew-normal mixture model for repeated measures data with excess zeros, with application to a cohort study of paediatric respiratory symptoms, 2010, BMC medical research methodology.

[33] Takeshi Amemiya, et al. Introduction To Statistics And Econometrics, 1994.

[34] Jeng-Min Chiou, et al. Estimated estimating equations: semiparametric inference for clustered and longitudinal data, 2005.

[35] Chenlei Leng, et al. Semiparametric Mean–Covariance Regression Analysis for Longitudinal Data, 2009.

[36] L C Sobell, et al. The reliability of a timeline method for assessing normal drinker college students' recent drinking history: utility for alcohol research, 1986, Addictive behaviors.

[37] L C Sobell, et al. Reliability of a timeline method: assessing normal drinkers' reports of recent drinking and a comparative evaluation across several populations, 1988, British journal of addiction.

[38] Gene H. Golub, et al. Calculation of Gauss quadrature rules, 1967, Milestones in Matrix Computation.

[39] K. Berk, et al. Repeated measures with zeros, 2002, Statistical methods in medical research.

[40] James B. McDonald, et al. A generalization of the beta distribution with applications, 1995.

[41] Rob J Hyndman, et al. Applications: Generalized Additive Modelling of Mixed Distribution Markov Models with Application to Melbourne's Rainfall, 2000.

[42] Gary K Grunwald, et al. Analysis of repeated measures data with clumping at zero, 2002, Statistical methods in medical research.

[43] Haiyi Xie, et al. A Method for Analyzing Longitudinal Outcomes with Many Zeros, 2004, Mental health services research.

[44] Q. Vuong. Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses, 1989.

[45] Lei Liu, et al. A flexible two-part random effects model for correlated medical costs, 2010, Journal of health economics.