Loter: A Software Package to Infer Local Ancestry for a Wide Range of Species

Abstract: Admixture between populations provides an opportunity to study biological adaptation and phenotypic variation. Admixture studies rely on local ancestry inference for admixed individuals, which consists of computing, at each locus, the number of copies that originate from each ancestral source population. Existing software packages for local ancestry inference are tuned to provide accurate results on human data and recent admixture events. Here, we introduce Loter, an open-source software package that requires no biological parameters besides haplotype data, in order to make local ancestry inference available for a wide range of species. Using simulations, we compare the performance of Loter to that of HAPMIX, LAMP-LD, and RFMix. HAPMIX is the only software severely impacted by imperfect haplotype reconstruction. On simulated admixed human genotypes, Loter is the software least impacted by increasing admixture time. For simulations of admixed Populus genotypes, Loter and LAMP-LD are robust to increasing admixture times, in contrast to RFMix. When comparing the lengths of reconstructed and true ancestry tracts, Loter and LAMP-LD again provide results whose accuracy is more robust to increasing admixture times than that of RFMix. We apply Loter to individuals resulting from admixture between Populus trichocarpa and Populus balsamifera, and the lengths of ancestry tracts indicate that admixture took place ∼100 generations ago. We expect that providing rapid and parameter-free software for local ancestry inference will make genomic studies of admixture processes more accessible.
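
To make the two quantities mentioned in the abstract concrete, the following is a minimal illustrative sketch, not Loter's API or algorithm. It assumes that per-locus ancestry labels for the two haplotypes of a diploid individual are already available (the labels below are invented toy data), and the helper names `ancestry_dosage` and `tract_lengths` are hypothetical. It shows how the number of copies from one source population is obtained at each locus, and how ancestry tracts can be read off as runs of identical labels. Tract lengths are counted here in loci; a real analysis would typically measure them in physical or genetic distance.

```python
from itertools import groupby

def ancestry_dosage(hap_a, hap_b, source=1):
    """Copies (0, 1, or 2) inherited from `source` at each locus of a diploid individual."""
    return [int(a == source) + int(b == source) for a, b in zip(hap_a, hap_b)]

def tract_lengths(hap):
    """Run-length encode one haplotype as (ancestry_label, number_of_loci) pairs."""
    return [(label, sum(1 for _ in run)) for label, run in groupby(hap)]

# Invented toy labels: 0 = first source population, 1 = second source population
hap_a = [0, 0, 0, 1, 1, 1, 1, 0]
hap_b = [1, 1, 0, 0, 1, 1, 0, 0]

dosage = ancestry_dosage(hap_a, hap_b)   # per-locus copies from population 1
tracts = tract_lengths(hap_a)            # ancestry tracts along the first haplotype
print(dosage)
print(tracts)
```

Shorter tracts are generally expected after more generations of recombination, which is why tract lengths carry information about admixture time.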
