Analysis of Latino populations from GALA and MEC studies reveals genomic loci with biased local ancestry estimation

MOTIVATION: Local ancestry analysis of genotype data from recently admixed populations (e.g. Latinos, African Americans) provides key insights into population history and disease genetics. Although methods for local ancestry inference have been extensively validated in simulations (under many unrealistic assumptions), no empirical study of local ancestry accuracy in Latinos exists to date. Hence, interpreting findings that rely on local ancestry in Latinos is challenging.

RESULTS: Here, we use 489 nuclear families from the mainland USA, Puerto Rico and Mexico in conjunction with 3204 unrelated Latinos from the Multiethnic Cohort study to provide the first empirical characterization of local ancestry inference accuracy in Latinos. Our approach for identifying errors does not rely on simulations but on the observation that local ancestry in families follows Mendelian inheritance. We measure the rate of local ancestry assignments that lead to Mendelian inconsistencies in local ancestry in trios (MILANC), which provides a lower bound on errors in the local ancestry estimates. First, we show that MILANC rates observed in simulations underestimate the rates observed in real data, and that MILANC varies substantially across the genome. Second, across a wide range of methods, we observe that loci with large deviations in local ancestry also show enrichment in MILANC rates. Therefore, local ancestry estimates at such loci should be interpreted with caution. Finally, we reconstruct ancestral haplotype panels to be used as reference panels in local ancestry inference and show that ancestry inference is significantly improved by incorporating these reference panels.

AVAILABILITY AND IMPLEMENTATION: We provide the reconstructed reference panels together with the maps of MILANC rates as a public resource for researchers analyzing local ancestry in Latinos at http://bogdanlab.pathology.ucla.edu.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
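
The idea behind MILANC can be illustrated with a minimal sketch. It is not the authors' implementation: the encoding of a local ancestry call as an unordered pair of labels, the labels themselves, and the function names `is_mendelian_consistent` and `milanc_rate` are all assumptions made for illustration. A child's call at a locus is treated as consistent if one of its two ancestries can come from the mother and the other from the father. Calls that are wrong but still compatible with the parents go undetected, which is why such a rate can only bound the true error rate from below.

```python
from typing import Sequence, Tuple

# Hypothetical encoding: a diploid local ancestry call at one locus is an
# unordered pair of ancestry labels, e.g. ("EUR", "NAM").
Call = Tuple[str, str]
Trio = Tuple[Call, Call, Call]  # (child, mother, father)


def is_mendelian_consistent(child: Call, mother: Call, father: Call) -> bool:
    """True if the child's two ancestries can be split so that one is
    present in the mother's call and the other in the father's call."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def milanc_rate(trios: Sequence[Trio]) -> float:
    """Fraction of trios whose calls at a locus are Mendelian-inconsistent."""
    if not trios:
        raise ValueError("need at least one trio")
    inconsistent = sum(
        1
        for child, mother, father in trios
        if not is_mendelian_consistent(child, mother, father)
    )
    return inconsistent / len(trios)


# Toy calls at a single locus (made-up data, for illustration only).
trios_at_locus = [
    (("EUR", "NAM"), ("EUR", "EUR"), ("NAM", "NAM")),  # consistent
    (("AFR", "AFR"), ("EUR", "AFR"), ("EUR", "EUR")),  # father carries no AFR
    (("EUR", "EUR"), ("EUR", "NAM"), ("EUR", "AFR")),  # consistent
    (("NAM", "NAM"), ("NAM", "NAM"), ("EUR", "NAM")),  # consistent
]

print(milanc_rate(trios_at_locus))  # 0.25
```

Computing this fraction separately at each locus would give a per-locus map of inconsistency rates, under the same assumed encoding.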
