Cost-efficient study designs for binary response data with Gaussian covariate measurement error.

When mismeasurement of the exposure variable is anticipated, epidemiologic cohort studies may be augmented with a validation study, in which a small sample of data relating the imperfect exposure measurement method to a better method is collected. Optimal study designs (i.e., the least expensive designs that satisfy specified power constraints) are developed; each gives the overall sample size and the proportion of that sample allocated to the validation study. If better exposure measurements can be collected on a sample of subjects, an optimal design can be suggested that conforms to realistic budgetary constraints. The properties of three designs are compared: those that include an internal validation study, those where the validated subsample is drawn from subjects external to the primary investigation, and those that use the better method of exposure assessment on all subjects. The proportion of overall study resources allocated to the validation substudy increases with increasing sample disease frequency, decreasing unit cost of the superior exposure measurement relative to the imperfect one, increasing unit cost of outcome ascertainment, increasing distance between the two alternative values of the relative risk between which the study is designed to discriminate, and increasing magnitude of the hypothesized values. This proportion also depends nonlinearly on the severity of measurement error, and when the validation study is internal, measurement error reaches a point beyond which the optimal design is the smaller, fully validated one.
