Inference of locus-specific ancestry in closely related populations

Motivation: A characterization of the genetic variation of recently admixed populations may reveal historical population events, and is useful for detecting single nucleotide polymorphisms (SNPs) associated with disease through association studies and admixture mapping. Inference of locus-specific ancestry is key to understanding the genetic variation of such populations. While a number of methods for inferring locus-specific ancestry are accurate when the ancestral populations are genetically distant (e.g. African Americans), current methods incur a large error rate in admixed populations whose ancestral populations are closely related (e.g. Americans of European descent).

Results: In this work, we extend previous methods for the inference of locus-specific ancestry by incorporating a refined model of recombination events. We present an efficient dynamic programming algorithm to infer the locus-specific ancestries under this model, resulting in a method that attains improved accuracy; the improvement is most significant when the ancestral populations are closely related. An evaluation on a wide range of scenarios, including admixtures of the 52 population groups from the Human Genome Diversity Project, demonstrates that locus-specific ancestry can indeed be accurately inferred in these admixtures using our method. Finally, we demonstrate that imputation methods, when applied to admixed populations, can be improved by incorporating locus-specific ancestry.

Availability: The implementation of the WINPOP model is available as part of the LAMP package at http://lamp.icsi.berkeley.edu/lamp

Contact: heran@icsi.berkeley.edu
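
As a rough illustration of the kind of dynamic program involved in locus-specific ancestry inference, the sketch below runs a generic Viterbi decoder over a simple hidden Markov model on a single haploid sequence. It is not the WINPOP model or the LAMP implementation, which the abstract does not specify in detail. The function name `viterbi_ancestry`, the allele frequencies, and the switch probability are all hypothetical, chosen only for illustration.

```python
import math


def viterbi_ancestry(alleles, freqs, switch_prob):
    """Return the most likely ancestry label at each SNP (illustrative only).

    alleles:     list of 0/1 alleles, one per SNP, for one haploid sequence.
    freqs:       freqs[k][i] is the assumed frequency of allele 1 at SNP i in
                 ancestral population k (must lie strictly between 0 and 1).
    switch_prob: assumed probability that ancestry changes between adjacent
                 SNPs (a crude stand-in for a recombination model).
    Requires at least two ancestral populations.
    """
    n_pops = len(freqs)
    n_snps = len(alleles)

    def emit(k, i):
        # Log-probability of the observed allele under population k.
        p = freqs[k][i]
        return math.log(p if alleles[i] == 1 else 1.0 - p)

    stay = math.log(1.0 - switch_prob)
    move = math.log(switch_prob / (n_pops - 1))

    # Initialise with a uniform prior over ancestral populations.
    score = [math.log(1.0 / n_pops) + emit(k, 0) for k in range(n_pops)]
    back = []  # back[i - 1][k] = best previous state for state k at SNP i

    for i in range(1, n_snps):
        new_score, pointers = [], []
        for k in range(n_pops):
            best_prev = max(
                range(n_pops),
                key=lambda j: score[j] + (stay if j == k else move),
            )
            trans = stay if best_prev == k else move
            new_score.append(score[best_prev] + trans + emit(k, i))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_pops), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy example with two well-separated hypothetical populations.
toy_freqs = [[0.9] * 6, [0.1] * 6]
print(viterbi_ancestry([1, 1, 1, 0, 0, 0], toy_freqs, 0.05))
# prints [0, 0, 0, 1, 1, 1]
```

When the two rows of `freqs` are nearly identical, as for closely related ancestral populations, the per-SNP emission terms carry little information and a decoder of this kind becomes unreliable, which is the difficulty the abstract describes.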
