Comparison of three methods to estimate genetic ancestry and control for stratification in genetic association studies among admixed populations

Population stratification may confound the results of genetic association studies among unrelated individuals from admixed populations. Several methods have been proposed to estimate ancestry in admixed populations and to adjust for population stratification in genetic association tests. We evaluate the performance of three methods (maximum likelihood estimation, ADMIXMAP, and Structure) using various simulated data sets and real data from Latino subjects participating in a genetic study of asthma. All three methods estimate ancestry with similar accuracy and control the type I error rate to a similar degree. The most important factor in determining the accuracy of the ancestry estimates and in minimizing the type I error rate is the number of markers used to estimate ancestry. We demonstrate that approximately 100 ancestry informative markers (AIMs) are required to obtain ancestry estimates whose correlation with the true individual ancestral proportions exceeds 0.9. In addition, after accounting for ancestry in the association tests, the excess type I error is corrected and the rate is controlled at the 5% level when 100 markers are used to estimate ancestry. However, because the effect of admixture on the type I error rate worsens as sample size grows, the accuracy of the ancestry estimates must also increase for the correction to remain adequate. Using data from the Latino subjects, we also apply these methods to an association study between body mass index and 44 AIMs. These simulations are meant to provide practical guidelines for investigators conducting association studies in admixed populations.
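To make the maximum likelihood approach concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simplified setting with only two ancestral populations, known ancestral allele frequencies at each AIM, unlinked markers, and genotypes coded as allele counts (0, 1, or 2). Under these assumptions, an individual with ancestry proportion q from population 1 carries the counted allele at marker j with probability f_j = q·p1_j + (1 − q)·p2_j, and the estimate is the value of q that maximizes the binomial log-likelihood. The function names `log_likelihood` and `estimate_ancestry` and the grid search are illustrative choices, not taken from the paper.

```python
import math
import random


def log_likelihood(q, genotypes, p1, p2, eps=1e-9):
    """Binomial log-likelihood of ancestry proportion q (assumed two-population model)."""
    total = 0.0
    for g, a, b in zip(genotypes, p1, p2):
        # Expected allele frequency for an individual with ancestry q from population 1.
        f = q * a + (1.0 - q) * b
        # Clamp away from 0 and 1 so the logarithms stay finite.
        f = min(max(f, eps), 1.0 - eps)
        total += g * math.log(f) + (2 - g) * math.log(1.0 - f)
    return total


def estimate_ancestry(genotypes, p1, p2, grid_size=1000):
    """Return the grid value of q in [0, 1] with the highest log-likelihood."""
    best_q, best_ll = 0.0, -math.inf
    for i in range(grid_size + 1):
        q = i / grid_size
        ll = log_likelihood(q, genotypes, p1, p2)
        if ll > best_ll:
            best_q, best_ll = q, ll
    return best_q


if __name__ == "__main__":
    # Hypothetical simulation: 100 markers with a large frequency difference
    # between the two ancestral populations, and a true ancestry of 0.3.
    rng = random.Random(0)
    n_markers, true_q = 100, 0.3
    p1 = [rng.uniform(0.7, 0.95) for _ in range(n_markers)]
    p2 = [rng.uniform(0.05, 0.3) for _ in range(n_markers)]
    genotypes = []
    for a, b in zip(p1, p2):
        f = true_q * a + (1.0 - true_q) * b
        genotypes.append(sum(rng.random() < f for _ in range(2)))
    print("estimated ancestry:", estimate_ancestry(genotypes, p1, p2))
```

A grid search is used here only because the likelihood has a single parameter on a bounded interval; with more than two ancestral populations a constrained optimizer or an EM algorithm would be the more natural choice. Rerunning the simulation with fewer markers gives a rough feel for why the number of AIMs drives the accuracy of the estimate.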
