Analysis of Correlated Data with SAS and R

PREFACE TO THE FIRST EDITION
PREFACE TO THE SECOND EDITION
PREFACE TO THE THIRD EDITION

ANALYZING CLUSTERED DATA
  Regression Analysis for Clustered Data
  Generalized Linear Models
  Fitting Alternative Models for Clustered Data

ANALYSIS OF CROSS-CLASSIFIED DATA
  Measures of Association in 2 x 2 Tables
  Analysis of Several 2 x 2 Contingency Tables
  Analysis of 1:1 Matched Pairs
  Statistical Analysis of Clustered Binary Data
  Sample Size Requirements for Clustered Binary Data
  Discussion

MODELING BINARY OUTCOME DATA
  The Logistic Regression Model
  Modeling Correlated Binary Outcome Data
  Logistic Regression for Case-Control Studies
  Sample-Size Calculations for Logistic Regression

ANALYSIS OF CLUSTERED COUNT DATA
  Poisson Regression Model
  Inference and Goodness of Fit
  Over-Dispersion in Count Data
  Count Data Random Effects Models
  Other Models

ANALYSIS OF TIME SERIES
  Simple Descriptive Methods
  Fundamental Concepts in the Analysis of Time Series
  Models for Stationary Time Series
  ARIMA Models
  Forecasting
  Modeling Seasonality with ARIMA: The Condemnation Rates Series Revisited

REPEATED MEASURES AND LONGITUDINAL DATA ANALYSIS
  Methods for the Analysis of Repeated Measures Data
  Mixed Linear Regression Models
  Examples Using the SAS Mixed and GLIMMIX Procedures

SURVIVAL DATA ANALYSIS
  Examples
  Estimating the Survival Probabilities
  Modeling Correlated Survival Data
  Sample Size Requirements for Survival Data

REFERENCES
INDEX

Introductions appear at the beginning of each chapter.
