Tree‐based recursive partitioning methods for subdividing sibpairs into relatively more homogeneous subgroups

We propose a new splitting rule for recursively partitioning sibpair data into relatively more homogeneous subgroups. This strategy is designed to identify subgroups of sibpairs such that within‐subgroup analyses have increased power to detect linkage using Haseman‐Elston regression. We assume that the subgroups can be defined by patterns of non‐genetic binary covariates measured on each sibpair. The data we consider consist of the squared difference of a quantitative trait measurement on each sibpair, estimates of identity‐by‐descent (IBD) values at each genetic marker, and binary covariate data describing characteristics of the sibpair (e.g., race, sex, family history of disease). To test the efficacy of this method in linkage analysis, we performed two simulation experiments. In the first, we simulated a mixture in which 66.6% of the sibpairs had no linkage and 33.3% had genetic linkage to one marker. The two groups were distinguished by the value of a single binary covariate. We also simulated one unlinked marker and one random covariate to include as noise in the data. In the second experiment, we simulated a mixture in which 55% of the sibpairs had no genetic linkage, 22.5% had genetic linkage to one marker, and 22.5% had linkage to a different marker. Each subgroup was defined by a distinct pattern of two binary covariates. We also simulated one unlinked marker and two random covariates to include as noise in the data. Our simulation studies yielded three findings. First, fitting Haseman‐Elston regression models to homogeneous subgroups can significantly increase the overall power to detect linkage, with only a small increase in the false‐positive rate. Second, the splitting rule can correctly identify important covariates and linked markers. Third, recursive partitioning of sibpair data using this splitting rule can correctly identify sibpair subgroups.
These results indicate that partitioning sibpairs into homogeneous subgroups is feasible and significantly increases the power to detect linkage, demonstrating the practical utility and potential of this new methodology. Genet. Epidemiol. 20:293–306, 2001. © 2001 Wiley‐Liss, Inc.
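
The intuition behind fitting Haseman‐Elston regression within subgroups can be illustrated with a minimal sketch. It is not the splitting rule or the simulation design used in the paper, whose details are not given here. It only assumes a setup loosely modeled on the first experiment: roughly one third of sibpairs are linked to a marker, the linked group is flagged by one binary covariate, and IBD proportions are treated as known rather than estimated. The function names (`simulate_sibpairs`, `he_slope`), the variance parameters (`sigma_g2`, `sigma_e2`), and the model for the sib difference are all illustrative assumptions.

```python
import random


def simulate_sibpairs(n, linked_fraction=1 / 3, sigma_g2=1.0, sigma_e2=1.0, seed=0):
    """Simulate (squared trait difference, IBD proportion, covariate) triples.

    Assumed model (hypothetical, for illustration only): in linked pairs the
    variance of the sib difference falls linearly with the IBD proportion;
    in unlinked pairs it does not depend on IBD at the marker.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        # Binary covariate that marks membership in the linked subgroup.
        covariate = 1 if rng.random() < linked_fraction else 0
        # IBD proportion 0, 1/2, 1 with probabilities 1/4, 1/2, 1/4.
        ibd = rng.choice([0.0, 0.5, 0.5, 1.0])
        if covariate == 1:
            variance = 2 * sigma_e2 + 2 * sigma_g2 * (1 - ibd)
        else:
            variance = 2 * sigma_e2 + sigma_g2
        diff = rng.gauss(0.0, variance ** 0.5)
        pairs.append((diff * diff, ibd, covariate))
    return pairs


def he_slope(pairs):
    """Least-squares slope of squared difference on IBD proportion.

    A clearly negative slope is the Haseman-Elston signature of linkage.
    """
    n = len(pairs)
    mean_y = sum(y for y, _, _ in pairs) / n
    mean_x = sum(x for _, x, _ in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for y, x, _ in pairs)
    sxx = sum((x - mean_x) ** 2 for _, x, _ in pairs)
    return sxy / sxx


pairs = simulate_sibpairs(20000)
pooled = he_slope(pairs)
linked = he_slope([p for p in pairs if p[2] == 1])
unlinked = he_slope([p for p in pairs if p[2] == 0])
print(f"pooled slope:   {pooled:.3f}")
print(f"linked slope:   {linked:.3f}")
print(f"unlinked slope: {unlinked:.3f}")
```

Under these assumptions the pooled slope is diluted by the unlinked pairs, while the slope within the covariate‐defined linked subgroup is steeper. That dilution is what a partitioning method of this kind aims to remove.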