Genetic Structure, Self-identified Race/ethnicity, and Confounding in Case-control Association Studies

We have analyzed genetic data for 326 microsatellite markers that were typed uniformly in a large multiethnic population-based sample of individuals as part of a study of the genetics of hypertension (Family Blood Pressure Program). Subjects identified themselves as belonging to one of four major racial/ethnic groups (white, African American, East Asian, and Hispanic) and were recruited from 15 different geographic locales within the United States and Taiwan. Genetic cluster analysis of the microsatellite markers produced four major clusters, which showed near-perfect correspondence with the four self-reported race/ethnicity categories. Of 3,636 subjects of varying race/ethnicity, only 5 (0.14%) showed genetic cluster membership different from their self-identified race/ethnicity. On the other hand, we detected only modest genetic differentiation between different current geographic locales within each race/ethnicity group. Thus, ancient geographic ancestry (which is highly correlated with self-identified race/ethnicity), rather than current residence, is the major determinant of genetic structure in the U.S. population. Implications of this genetic structure for case-control association studies are discussed.
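The discordance figure quoted above is a simple proportion, and the following minimal Python sketch shows one way such a rate could be tallied. It is illustrative only and is not the authors' analysis: the function name `discordance_rate`, the assumption that each genetic cluster has already been mapped to a race/ethnicity label, and the synthetic label lists are all hypothetical. Only the totals (3,636 subjects, 5 discordant) come from the abstract.

```python
def discordance_rate(self_reported, cluster_labels):
    """Count subjects whose genetic cluster label differs from their
    self-identified group, and return (count, fraction).

    Assumes (hypothetically) that clusters were already mapped to group names.
    """
    if len(self_reported) != len(cluster_labels):
        raise ValueError("label lists must have the same length")
    mismatches = sum(s != c for s, c in zip(self_reported, cluster_labels))
    return mismatches, mismatches / len(self_reported)


# Synthetic data: group sizes and which subjects mismatch are made up;
# only the total (3,636) and the mismatch count (5) follow the abstract.
groups = ["white", "African American", "East Asian", "Hispanic"]
self_reported = [groups[i % 4] for i in range(3636)]
cluster_labels = list(self_reported)
for i in range(5):
    # Reassign five subjects to a different cluster than their self-report.
    cluster_labels[i] = groups[(i + 1) % 4]

n_mismatch, rate = discordance_rate(self_reported, cluster_labels)
print(f"{n_mismatch} of {len(self_reported)} discordant ({rate:.2%})")
```

With these counts the computed rate is 5/3,636, which rounds to the 0.14% reported in the abstract.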
