Likelihood and Bayes estimation of ancestral population sizes in hominoids using data from multiple loci.

Polymorphisms in an ancestral population can cause conflicts between gene trees and the species tree. Such conflicts can be used to estimate ancestral population sizes when data from multiple loci are available. In this article I extend previous work on estimating ancestral population sizes to analyze sequence data from three species under a finite-site nucleotide substitution model. Both maximum-likelihood (ML) and Bayes methods are implemented for joint estimation of the two speciation dates and the two population size parameters. Both methods account for uncertainty in the gene tree, which arises because each locus contains few informative sites, and both make efficient use of the information in the data. The Bayes algorithm, which uses Markov chain Monte Carlo (MCMC), enjoys a computational advantage over ML and also provides a framework for incorporating prior information about the parameters. The methods are applied to a data set of 53 nuclear noncoding contigs from human, chimpanzee, and gorilla published by Chen and Li. Estimates of the effective population size for the common ancestor of humans and chimpanzees by both ML and Bayes methods are approximately 12,000-21,000. These estimates are comparable to those for modern humans and do not support the notion of a dramatic size reduction in early human populations. Estimates published previously from the same data are several times larger and appear to be biased because of methodological deficiencies. The divergence between humans and chimpanzees is dated at approximately 5.2 million years ago and the gorilla divergence at 1.1-1.7 million years earlier. The analysis suggests that typical data sets contain useful information about ancestral population sizes and that it is advantageous to analyze data from several species simultaneously.
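
The link between gene-tree conflict and ancestral population size can be illustrated with the classical coalescent result for three species with one sequence each: the probability that the gene tree differs from the species tree is (2/3)exp(-T/(2N)), where T is the number of generations between the two speciation events and N is the effective size of the ancestral population. The sketch below evaluates that expression and inverts it. It is not the ML or Bayes method of this article, which additionally accounts for uncertainty in the gene tree at each locus. The function names and the 20-year generation time are assumptions made for illustration only.

```python
import math


def mismatch_probability(n_e, gap_years, generation_years):
    """Probability that a gene tree conflicts with the species tree.

    Classical result for three species, one sequence per species:
    P = (2/3) * exp(-T / (2N)), with T in generations.
    """
    generations = gap_years / generation_years
    return (2.0 / 3.0) * math.exp(-generations / (2.0 * n_e))


def effective_size_from_mismatch(p_mismatch, gap_years, generation_years):
    """Invert the formula above to get N from a known mismatch probability.

    This assumes the true gene trees are known without error, which is
    the simplification the article's methods are designed to avoid.
    """
    if not 0.0 < p_mismatch < 2.0 / 3.0:
        raise ValueError("p_mismatch must lie strictly between 0 and 2/3")
    generations = gap_years / generation_years
    return -generations / (2.0 * math.log(1.5 * p_mismatch))


if __name__ == "__main__":
    generation_years = 20.0  # assumed generation time, not from the abstract
    # Ranges quoted in the abstract: N of 12,000-21,000 and a gap of
    # 1.1-1.7 million years between the two speciation events.
    for n_e in (12_000, 21_000):
        for gap_years in (1.1e6, 1.7e6):
            p = mismatch_probability(n_e, gap_years, generation_years)
            print(f"N={n_e}, gap={gap_years:.1e} years: P(mismatch)={p:.4f}")
```

Because the mismatch probability depends on N only through the ratio T/(2N), a count of conflicting gene trees constrains that ratio rather than N itself, which is one reason the speciation dates and population sizes are estimated jointly.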

[1]  R. Hudson Gene trees, species trees and the segregation of ancestral alleles. , 1992, Genetics.

[2]  Chung-I Wu,et al.  Inferences of species phylogeny in relation to segregation of ancient polymorphisms. , 1991, Genetics.

[3]  N. Metropolis,et al.  Equation of State Calculations by Fast Computing Machines , 1953, Resonance.

[4]  Feng-Chi Chen,et al.  Genomic divergences between humans and other hominoids and the effective population size of the common ancestor of humans and chimpanzees. , 2001, American journal of human genetics.

[5]  Ziheng Yang,et al.  PAML: a program package for phylogenetic analysis by maximum likelihood , 1997, Comput. Appl. Biosci..

[6]  J. Klein,et al.  DNA archives and our nearest relative: the trichotomy problem revisited. , 2000, Molecular phylogenetics and evolution.

[7]  M. Ruvolo,et al.  Molecular phylogeny of the hominoids: inferences from multiple independent DNA sequence data sets. , 1997, Molecular biology and evolution.

[8]  Wen-Hsiung Li,et al.  Global patterns of human DNA sequence variation in a 10-kb region on chromosome 1. , 2001, Molecular biology and evolution.

[9]  M. Nachman,et al.  Estimate of the mutation rate per nucleotide in humans. , 2000, Genetics.

[10]  Z. Yang On the estimation of ancestral population sizes of modern humans. , 1997, Genetical research.

[11]  W. K. Hastings,et al.  Monte Carlo Sampling Methods Using Markov Chains and Their Applications , 1970 .

[12]  J. Hacia,et al.  Genome of the apes. , 2001, Trends in genetics : TIG.

[13]  Ziheng Yang Statistical Properties of the Maximum Likelihood Method of Phylogenetic Estimation and Comparison With Distance Matrix Methods , 1994 .

[14]  M. Kreitman,et al.  Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster , 1983, Nature.

[15]  P. Liò,et al.  Models of molecular evolution and phylogeny. , 1998, Genome research.

[16]  N. M. Brooke,et al.  A molecular timescale for vertebrate evolution , 1998, Nature.

[17]  Y. Fu,et al.  A phylogenetic estimator of effective population size or mutation rate. , 1994, Genetics.

[18]  M. Nei Molecular Evolutionary Genetics , 1987 .

[19]  S. Edwards,et al.  Gene divergence, population divergence, and the variance in coalescence time in phylogeographic studies , 2001 .

[20]  J. Klein,et al.  Divergence time and population size in the lineage leading to modern humans. , 1995, Theoretical population biology.

[21]  Z. Yang,et al.  Estimation of primate speciation dates using local molecular clocks. , 2000, Molecular biology and evolution.

[22]  S. Pääbo,et al.  Great ape DNA sequences reveal a reduced diversity and an expansion in humans , 2001, Nature Genetics.

[23]  N. Takahata An attempt to estimate the effective size of the ancestral species common to two extant species from which homologous genes are sequenced. , 1986, Genetical research.

[24]  L. Jin,et al.  Worldwide DNA sequence variation in a 10-kilobase noncoding region on human chromosome 22 , 2000 .