Analysis of clustered data in community psychology: With an example from a worksite smoking cessation project

Although community psychology research commonly has data at both the community (cluster) level and the individual level, the analysis of such clustered data presents difficulties for many researchers. Because individuals within a cluster cannot be assumed to be independent, the use of many traditional statistical techniques that assume independence of observations is problematic. Further, there is often interest in assessing the degree of dependence in the data resulting from the clustering of individuals within communities. In this paper, a random-effects regression model is described for the analysis of clustered data. Unlike ordinary regression analysis, random-effects regression models do not assume that each observation is independent; instead, they assume that data within clusters are dependent to some degree. The degree of this dependency is estimated along with the usual model parameters, so that the estimated effects are adjusted for the dependency resulting from the clustering of the data. Models are described for both continuous and dichotomous outcome variables, and available statistical software for these models is discussed. An analysis of a data set in which individuals are clustered within firms is used to illustrate features of random-effects regression analysis, relative to both individual-level analysis, which ignores the clustering of the data, and cluster-level analysis, which aggregates the individual data.
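To make the notion of "degree of dependence" concrete, the sketch below simulates individuals nested within clusters and computes an intraclass correlation, the share of total variance attributable to cluster membership. It is an illustration under stated assumptions, not the estimation method of the paper: it uses the simple one-way ANOVA estimator for balanced clusters rather than maximum likelihood estimation of a random-effects regression, and the function names, cluster counts, and variance values are all hypothetical.

```python
import random


def anova_icc(clusters):
    """One-way ANOVA estimate of the intraclass correlation.

    Assumes a balanced design: every cluster has the same size n.
    This is a simple stand-in, not a random-effects regression fit.
    """
    k = len(clusters)
    n = len(clusters[0]) if k else 0
    if k < 2 or n < 2 or any(len(c) != n for c in clusters):
        raise ValueError("need at least 2 equal-sized clusters of size >= 2")

    means = [sum(c) / n for c in clusters]
    grand = sum(means) / k

    # Between-cluster and within-cluster mean squares.
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    msw = sum(
        (y - m) ** 2 for c, m in zip(clusters, means) for y in c
    ) / (k * (n - 1))

    denom = msb + (n - 1) * msw
    if denom == 0:
        raise ValueError("no variation in the data")
    # Note: this estimator can be negative when clusters are very similar.
    return (msb - msw) / denom


def simulate(k, n, sd_cluster, sd_resid, seed=0):
    """Simulate k clusters of n individuals with a shared cluster effect."""
    rng = random.Random(seed)
    data = []
    for _ in range(k):
        cluster_effect = rng.gauss(0, sd_cluster)  # shared by all members
        data.append([cluster_effect + rng.gauss(0, sd_resid) for _ in range(n)])
    return data


if __name__ == "__main__":
    # Hypothetical values: true ICC = 1^2 / (1^2 + 2^2) = 0.2
    data = simulate(k=200, n=20, sd_cluster=1.0, sd_resid=2.0)
    print(f"estimated ICC: {anova_icc(data):.3f}")
```

An estimate near zero would suggest that individual-level analysis ignoring the clusters loses little, while larger values indicate that observations within a cluster carry overlapping information and that standard errors from an analysis assuming independence are likely to be too small.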
