Multiway Admixture Deconvolution Using Phased or Unphased Ancestral Panels

We describe a novel method for inferring the local ancestry of admixed individuals from dense genome-wide single nucleotide polymorphism (SNP) data. The method, called MULTIMIX, allows multiple source populations, models population linkage disequilibrium between markers, and is applicable to datasets in which the sample and source populations are either phased or unphased. It is based on a hidden Markov model of switches in ancestry between consecutive windows of loci. Within each window, the observed haplotypes are modelled using a multivariate normal distribution with parameters estimated from the ancestral panels. We present three methods to fit the model: Markov chain Monte Carlo sampling, the Expectation Maximization algorithm, and a Classification Expectation Maximization algorithm. On individuals simulated to be admixed with European and West African ancestry, the method performs comparably to HAPMIX, with the ancestry calls of the two methods agreeing at 99.26% of loci across the three parameter groups. MULTIMIX is also faster than HAPMIX, and it performs well over a range of extents of admixture in a simulation involving three ancestral populations. In an analysis of real data, we estimate the contribution of European, West African and Native American ancestry to each locus in the Mexican samples of HapMap, obtaining estimates of ancestral proportions that are consistent with those previously reported.
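The window-based model described above can be illustrated with a minimal sketch. This is not the MULTIMIX implementation: it only shows, for a single phased haplotype, how multivariate normal emissions estimated from ancestral panels can be combined with a hidden Markov model over windows and decoded by the forward-backward algorithm. The function names, the fixed window size, the `switch` probability, the `ridge` regularisation, and the particular form of the transition matrix are all assumptions made for illustration.

```python
import numpy as np

def window_params(panel, window, ridge=0.1):
    """Per-window mean and regularised covariance of one ancestral panel.

    panel: (n_haplotypes, n_loci) array of 0/1 alleles.
    ridge: hypothetical regularisation that keeps each covariance invertible.
    """
    params = []
    for start in range(0, panel.shape[1], window):
        block = panel[:, start:start + window].astype(float)
        mean = block.mean(axis=0)
        cov = np.atleast_2d(np.cov(block, rowvar=False))
        params.append((mean, cov + ridge * np.eye(block.shape[1])))
    return params

def log_mvn(x, mean, cov):
    """Log density of a multivariate normal distribution at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + maha)

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = a.max(axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.exp(a - m).sum(axis=axis))

def local_ancestry_posterior(hap, panels, window=5, switch=0.05, prior=None):
    """Posterior ancestry probabilities per window, shape (n_windows, K)."""
    K = len(panels)
    prior = np.full(K, 1.0 / K) if prior is None else np.asarray(prior, float)
    params = [window_params(p, window) for p in panels]
    W = len(params[0])
    # Emission log-likelihood of each window under each source population.
    log_e = np.array([
        [log_mvn(hap[w * window:(w + 1) * window].astype(float), *params[k][w])
         for k in range(K)]
        for w in range(W)
    ])
    # Assumed transition model: with probability `switch` the ancestry is
    # redrawn from the prior, otherwise it is carried over unchanged.
    log_T = np.log(switch * np.tile(prior, (K, 1)) + (1 - switch) * np.eye(K))
    fwd = np.empty((W, K))
    bwd = np.zeros((W, K))
    fwd[0] = np.log(prior) + log_e[0]
    for w in range(1, W):
        fwd[w] = log_e[w] + logsumexp(fwd[w - 1][:, None] + log_T, axis=0)
    for w in range(W - 2, -1, -1):
        bwd[w] = logsumexp(log_T + (log_e[w + 1] + bwd[w + 1])[None, :], axis=1)
    log_post = fwd + bwd
    log_post -= logsumexp(log_post, axis=1)[:, None]
    return np.exp(log_post)

# Toy data (entirely synthetic): two panels with very different allele
# frequencies, and a haplotype that switches ancestry halfway along.
rng = np.random.default_rng(0)
panel_a = (rng.random((50, 40)) < 0.9).astype(int)
panel_b = (rng.random((50, 40)) < 0.1).astype(int)
hap = np.concatenate([np.ones(20, dtype=int), np.zeros(20, dtype=int)])
post = local_ancestry_posterior(hap, [panel_a, panel_b], window=5)
calls = post.argmax(axis=1)  # most probable source population per window
```

The sketch treats every window boundary identically and handles only phased data. A fuller treatment would fit the switch rate and ancestry proportions (for example by EM, as the abstract mentions) rather than fixing them.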
