Likelihood‐based methods for regression analysis with binary exposure status assessed by pooling

The need for resource-intensive laboratory assays to assess exposures in many epidemiologic studies provides ample motivation to consider study designs that incorporate pooled samples. In this paper, we consider the case in which specimens are combined for the purpose of determining the presence or absence of a pool-wise exposure, in lieu of assessing the actual binary exposure status for each member of the pool. We presume a primary logistic regression model for an observed binary outcome, together with a secondary regression model for exposure. We facilitate maximum likelihood analysis by complete enumeration of the possible implications of a positive pool, and we discuss the applicability of this approach under both cross-sectional and case-control sampling. We also provide a maximum likelihood approach for longitudinal or repeated measures studies where the binary outcome and exposure are assessed on multiple occasions and within-subject pooling is conducted for exposure assessment. Simulation studies illustrate the performance of the proposed approaches along with their computational feasibility using widely available software. We apply the methods to investigate gene-disease association in a population-based case-control study of colorectal cancer.
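The enumeration idea for a single pool under cross-sectional sampling can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a perfect assay (a pool tests positive exactly when at least one member is exposed), one scalar covariate per subject, and logistic forms for both the outcome and exposure models. The function name `pool_likelihood` and the parameter layouts `beta` and `gamma` are hypothetical.

```python
from itertools import product
from math import exp


def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-t))


def pool_likelihood(pool_positive, y, c, beta, gamma):
    """Likelihood contribution of one pool (illustrative sketch).

    Assumed models (not taken from the paper):
      outcome:  logit P(Y=1 | X, C) = b0 + bx*X + bc*C
      exposure: logit P(X=1 | C)    = g0 + gc*C
    Assumed assay: pool is positive iff at least one member has X=1.

    y, c  : per-member binary outcomes and scalar covariates
    beta  : (b0, bx, bc);  gamma : (g0, gc)
    """
    b0, bx, bc = beta
    g0, gc = gamma
    total = 0.0
    # Enumerate every exposure configuration of the pool members.
    for x in product((0, 1), repeat=len(y)):
        # Keep only configurations consistent with the observed pool result:
        # a negative pool allows only all-zero exposures, a positive pool
        # allows every configuration with at least one exposed member.
        if (max(x) == 1) != pool_positive:
            continue
        term = 1.0
        for yi, ci, xi in zip(y, c, x):
            px = expit(g0 + gc * ci)            # P(X=1 | C)
            py = expit(b0 + bx * xi + bc * ci)  # P(Y=1 | X, C)
            term *= (px if xi else 1.0 - px) * (py if yi else 1.0 - py)
        total += term
    return total


# Example: a positive pool of three subjects.
lik = pool_likelihood(True, [1, 0, 1], [0.2, -1.0, 0.5],
                      beta=(-0.5, 0.8, 0.3), gamma=(-1.0, 0.4))
```

The number of configurations grows as 2^k in the pool size k, so direct enumeration of this kind is practical only for small pools. Summing the logarithm of these contributions over pools gives a log-likelihood that a general-purpose optimizer could maximize.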
