Summary. This article applies a simple method for settings where one has clustered data, but statistical methods are only available for independent data. We assume the statistical method provides us with a normally distributed estimate, $\hat{\theta}$, and an estimate of its variance, $\hat{\sigma}^2$. We randomly select one data point from each cluster and apply our statistical method to the resulting independent data. We repeat this multiple times and use the average of the associated $\hat{\theta}$s as our estimate. An estimate of the variance is given by the average of the $\hat{\sigma}^2$s minus the sample variance of the $\hat{\theta}$s. We call this procedure multiple outputation, as all "excess" data within each cluster is thrown out multiple times. Hoffman, Sen, and Weinberg (2001, Biometrika 88, 1121–1134) introduced this approach for generalized linear models when the cluster size is related to outcome. In this article, we demonstrate the broad applicability of the approach. Applications to angular data, p-values, vector parameters, Bayesian inference, genetics data, and random cluster sizes are discussed. In addition, asymptotic normality of estimates based on all possible outputations, as well as on a finite number of outputations, is proven under weak conditions. Multiple outputation provides a simple and broadly applicable method for analyzing clustered data. It is especially suited to settings where methods for clustered data are impractical, but it can also be applied generally as a quick and simple tool.
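The procedure described in the summary can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `multiple_outputation` and `mean_estimator`, the default of 1000 outputations, and the toy clusters are all assumptions made for the example. Any method for independent data that returns an estimate and an estimated variance could stand in for `mean_estimator`.

```python
import random
import statistics


def mean_estimator(sample):
    """Hypothetical method for independent data: the sample mean
    and the usual estimate of its variance, s^2 / n."""
    n = len(sample)
    theta_hat = statistics.fmean(sample)
    var_hat = statistics.variance(sample) / n
    return theta_hat, var_hat


def multiple_outputation(clusters, estimator, n_outputations=1000, seed=0):
    """Average an independent-data analysis over repeated outputations.

    clusters: list of non-empty lists, one list of observations per cluster.
    estimator: function mapping an independent sample to (theta_hat, var_hat).
    """
    rng = random.Random(seed)
    thetas, variances = [], []
    for _ in range(n_outputations):
        # Keep one randomly chosen observation per cluster, discard the rest.
        sample = [rng.choice(cluster) for cluster in clusters]
        theta_hat, var_hat = estimator(sample)
        thetas.append(theta_hat)
        variances.append(var_hat)
    # Point estimate: average of the per-outputation estimates.
    theta_mo = statistics.fmean(thetas)
    # Variance estimate: average estimated variance minus the sample
    # variance of the estimates. This sketch does not guard against the
    # difference coming out negative.
    var_mo = statistics.fmean(variances) - statistics.variance(thetas)
    return theta_mo, var_mo


# Made-up toy data: five clusters of unequal size.
toy_clusters = [[1.2, 1.5, 1.1], [2.0], [0.7, 0.9], [1.8, 2.2, 2.1, 1.9], [1.4]]
theta_mo, var_mo = multiple_outputation(toy_clusters, mean_estimator)
print(theta_mo, var_mo)
```

When every cluster contains a single observation, each outputation returns the same sample, the sample variance of the estimates is zero, and the result reduces to the ordinary independent-data analysis.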
[1] K. Mardia. Statistics of Directional Data. 1972.
[2] M. D. Hogan, et al. Selection of the experimental unit in teratology studies. Teratology, 1975.
[3] J. K. Haseman, et al. The distribution of fetal death in control mice and its implications on statistical tests for dominant lethal effects. Mutation Research, 1976.
[4] L. L. Kupper, et al. The use of a correlated binomial model for the analysis of certain toxicological experiments. Biometrics, 1978.
[5] J. Ware, et al. Random-effects models for longitudinal data. Biometrics, 1982.
[6] S. Zeger, et al. Longitudinal data analysis using generalized linear models. 1986.
[7] N. Breslow, et al. Approximate inference in generalized linear mixed models. 1993.
[8] D. Rubin. Multiple Imputation After 18+ Years. 1996.
[9] Leukocyte reduction and ultraviolet B irradiation of platelets to prevent alloimmunization and refractoriness to platelet transfusions. The New England Journal of Medicine, 1997.
[10] C. McCulloch. Maximum Likelihood Algorithms for Generalized Linear Mixed Models. 1997.
[11] Pranab Kumar Sen, et al. Within-cluster resampling. 2001.