Statistical Issues in Modeling Chronic Disease in Cohort Studies

Observational cohort studies of individuals with chronic disease provide information on rates of disease progression, the effects of fixed and time-varying risk factors, and the extent of heterogeneity in the course of disease. Analysis of this information is often facilitated by multistate models in which intensity functions govern transitions between disease states. We discuss modeling and analysis issues for such models when individuals are observed intermittently. Frameworks for dealing with heterogeneity and measurement error are discussed, including random effects models, finite mixture models, and hidden Markov models. Cohorts are often defined by convenience, and ways of addressing outcome-dependent sampling or observation of individuals are also discussed. Data on progression of joint damage in psoriatic arthritis and retinopathy in diabetes are analyzed to illustrate these issues and related methodology.
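
As a rough illustration of what intermittent observation implies for a multistate model, the sketch below assumes a time-homogeneous continuous-time Markov process, so that the transition probability matrix over a gap of length t is exp(Qt) for an intensity matrix Q. The three-state progressive structure, the intensity values, the visit record, and the function names are all hypothetical choices made for this example and are not taken from the analyses summarized above.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, terms=30, squarings=10):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    n = len(m)
    scale = 2 ** squarings
    scaled = [[x / scale for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term holds scaled^k / k! after this step
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def transition_probs(q, t):
    """P(t) = exp(Q t) for a time-homogeneous Markov intensity matrix Q."""
    return mat_exp([[x * t for x in row] for row in q])

def panel_log_likelihood(q, visits):
    """Log-likelihood for one individual seen only at discrete visits.

    visits is a time-ordered list of (time, state) pairs; the exact transition
    times between visits are unobserved, so each consecutive pair contributes
    the transition probability over the gap between the two visits.
    """
    ll = 0.0
    for (t0, s0), (t1, s1) in zip(visits, visits[1:]):
        p = transition_probs(q, t1 - t0)
        ll += math.log(p[s0][s1])
    return ll

# Hypothetical progressive model: state 0 -> 1 -> 2, with state 2 absorbing.
Q = [[-0.2, 0.2, 0.0],
     [0.0, -0.1, 0.1],
     [0.0, 0.0, 0.0]]

# Hypothetical visit record: (years since entry, observed state).
visits = [(0.0, 0), (1.5, 0), (3.0, 1), (6.0, 2)]
log_lik = panel_log_likelihood(Q, visits)
```

Summing such contributions over individuals and maximizing over the entries of Q gives the usual panel-data likelihood under a Markov assumption. The sketch treats visit times as uninformative and states as observed without error, which are exactly the assumptions that the random effects, mixture, and hidden Markov frameworks mentioned above are meant to relax.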
