Fast and accurate inference of local ancestry in Latino populations

MOTIVATION: The analysis of genotype data from recently admixed populations is increasingly providing important insights into medical genetics and population history. Such analyses have been used to identify novel disease loci, to understand recombination rate variation and to detect recent selection events. The utility of these studies depends crucially on accurate and unbiased estimation of the ancestry at every genomic locus in recently admixed populations. Although various methods have been proposed and shown to be extremely accurate in two-way admixtures (e.g. African Americans), only a few approaches have been proposed and thoroughly benchmarked on multi-way admixtures (e.g. Latino populations of the Americas).

RESULTS: To address these challenges, we introduce methods for local ancestry inference that leverage the structure of linkage disequilibrium in the ancestral populations (LAMP-LD) and incorporate the constraint of Mendelian segregation when inferring local ancestry in nuclear family trios (LAMP-HAP). Our algorithms combine hidden Markov models (HMMs) of haplotype diversity within a novel window-based framework. Unlike in previous methods, the structure of our HMM depends on a fixed constant rather than on the number of reference haplotypes, so it can use large datasets while remaining highly efficient and robust to over-fitting. Through simulations and analysis of real data from 489 nuclear trio families from the mainland US, Puerto Rico and Mexico, we demonstrate that our methods achieve superior accuracy compared with published methods for local ancestry inference in Latinos.
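To make the window-based idea concrete, the following is a minimal sketch and not the LAMP-LD algorithm itself. It assumes a single phased haplotype coded as 0/1 alleles, at least two ancestral populations, and per-population allele frequencies strictly between 0 and 1. As a deliberate simplification, each window is scored with independent per-site allele frequencies instead of an HMM of haplotype diversity, and one ancestry is chosen per window by Viterbi decoding. The names `window_loglik`, `infer_local_ancestry` and `switch_prob` are hypothetical and do not come from the paper.

```python
import math


def window_loglik(hap, freqs, start, end):
    """Log-likelihood of alleles hap[start:end] under one population.

    freqs[i] is the assumed frequency of allele 1 at site i and must lie
    strictly between 0 and 1. Sites are treated as independent, which is a
    simplification of the haplotype models described in the abstract.
    """
    total = 0.0
    for allele, p in zip(hap[start:end], freqs[start:end]):
        total += math.log(p if allele == 1 else 1.0 - p)
    return total


def infer_local_ancestry(hap, freqs_by_pop, window_size, switch_prob=0.01):
    """Return one ancestry index per window using Viterbi decoding.

    switch_prob is a hypothetical per-window probability of changing
    ancestry, split evenly among the other populations.
    """
    n_pops = len(freqs_by_pop)
    if n_pops < 2:
        raise ValueError("need at least two ancestral populations")
    starts = list(range(0, len(hap), window_size))
    # Emission score of each window under each candidate ancestry.
    emis = [
        [window_loglik(hap, f, s, min(s + window_size, len(hap)))
         for f in freqs_by_pop]
        for s in starts
    ]
    log_stay = math.log(1.0 - switch_prob)
    log_switch = math.log(switch_prob / (n_pops - 1))

    def trans(j, k):
        return log_stay if j == k else log_switch

    # Uniform prior over ancestries for the first window.
    score = [emis[0][k] - math.log(n_pops) for k in range(n_pops)]
    back = []
    for w in range(1, len(starts)):
        new_score, ptr = [], []
        for k in range(n_pops):
            best_j = max(range(n_pops), key=lambda j: score[j] + trans(j, k))
            new_score.append(score[best_j] + trans(best_j, k) + emis[w][k])
            ptr.append(best_j)
        score = new_score
        back.append(ptr)

    # Trace back the best-scoring sequence of ancestries.
    state = max(range(n_pops), key=lambda k: score[k])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy example: two populations with opposite allele frequencies.
pop0 = [0.9] * 8
pop1 = [0.1] * 8
haplotype = [1, 1, 1, 1, 0, 0, 0, 0]
ancestry = infer_local_ancestry(haplotype, [pop0, pop1], window_size=4)
print(ancestry)  # [0, 1]
```

Fixing the state space to the number of ancestries keeps the decoding cost independent of how many reference haplotypes were used to estimate the frequencies. That loosely mirrors the fixed-size HMM structure the abstract describes, although the actual model is considerably richer.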
