A New Isolation with Migration Model along Complete Genomes Infers Very Different Divergence Processes among Closely Related Great Ape Species

We present a hidden Markov model (HMM) for inferring gradual isolation between two populations during speciation, modelled as a time interval with restricted gene flow. The HMM describes the history of adjacent nucleotides in two genomic sequences: the nucleotides can be separated by recombination, can migrate between populations, or can coalesce at variable time points. The probabilities of these events depend on the parameters of the model, which are the effective population sizes, splitting times, recombination rate, and migration rate. We show by extensive simulations that the HMM can accurately infer all parameters except the recombination rate, which is biased downwards. Inference is robust to variation in the mutation rate and the recombination rate along the sequence, and it is also robust to unknown phase of the genomes unless they are very closely related. We provide a test for whether divergence is gradual or instantaneous, and we apply the model to three key divergence processes in great apes: (a) the bonobo and common chimpanzee, (b) the eastern and western gorilla, and (c) the Sumatran and Bornean orang-utan. We find that the bonobo and chimpanzee appear to have undergone a clear split, whereas the divergence processes of the gorilla and orang-utan species occurred over several hundred thousand years, with gene flow stopping quite recently. We also apply the model to the Homo/Pan speciation event and find that the most likely scenario involves an extended period of gene flow during speciation.
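
To illustrate how an HMM assigns a likelihood to a pair of aligned sequences, the following is a minimal sketch of the standard scaled forward algorithm. It is not the authors' implementation: the two hidden states ("recent" and "ancient" coalescence), the binary emissions (0 = the two nucleotides match, 1 = they differ), and all numeric values in `TOY_INIT`, `TOY_TRANS`, and `TOY_EMIT` are hypothetical choices made only for illustration. In the model described above, the transition and emission probabilities would instead be derived from the population sizes, splitting times, recombination rate, and migration rate.

```python
import math

# Hypothetical toy parameters (assumptions, not values from the paper).
# State 0: the two sequences coalesce recently; state 1: they coalesce long ago.
TOY_INIT = [0.5, 0.5]
# Switching state between adjacent sites stands in for a recombination event.
TOY_TRANS = [[0.99, 0.01],
             [0.01, 0.99]]
# Emission columns: 0 = nucleotides match, 1 = nucleotides differ.
# Older coalescence leaves more time for mutations, hence more differences.
TOY_EMIT = [[0.99, 0.01],
            [0.90, 0.10]]


def forward_log_likelihood(obs, init, trans, emit):
    """Return log P(obs) under the HMM using the scaled forward algorithm.

    obs   : non-empty list of observed symbols (column indices into emit)
    init  : initial state probabilities
    trans : trans[s][t] = P(next state t | current state s)
    emit  : emit[s][x] = P(symbol x | state s)
    """
    n_states = len(init)

    # Forward variables for the first site.
    alpha = [init[s] * emit[s][obs[0]] for s in range(n_states)]
    scale = sum(alpha)
    log_lik = math.log(scale)
    alpha = [a / scale for a in alpha]

    # Recurse along the sequence, rescaling at every site so that long
    # alignments do not underflow; the log of each scale factor accumulates
    # into the total log-likelihood.
    for symbol in obs[1:]:
        alpha = [
            emit[t][symbol] * sum(alpha[s] * trans[s][t] for s in range(n_states))
            for t in range(n_states)
        ]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]

    return log_lik


if __name__ == "__main__":
    # A short made-up alignment summary: mostly matching sites, a few differences.
    toy_obs = [0, 0, 0, 1, 0, 0, 1, 1, 0, 0]
    print(forward_log_likelihood(toy_obs, TOY_INIT, TOY_TRANS, TOY_EMIT))
```

In a maximum-likelihood setting, a function of this kind would be evaluated repeatedly while a numerical optimiser varies the demographic parameters that determine the transition and emission matrices.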
