A new test for detecting recent positive selection that is free from the confounding impacts of demography.

Searching for traces of recent positive Darwinian selection in organisms is a long-standing interest in evolutionary biology. However, such efforts have been severely hindered by the confounding signatures of demography. Neutrality tests often lead to false inference of positive selection because they detect any deviation from the standard neutral model, whether it is caused by selection or by demography. Here, using the maximum frequency of derived mutations (MFDM) to examine the imbalance of the tree of a locus, I propose a statistical test that is analytically free from the confounding effects of varying population size and has high statistical power (up to 90.5%) to detect recent positive selection. When compared with five well-known neutrality tests for detecting selection (i.e., Tajima's D test, Fu and Li's D test, Fay and Wu's H test, the E test, and the joint DH test), the MFDM test is the only one free from the confounding impacts of bottlenecks and size expansions. Simulations over a wide range of parameters demonstrate that the MFDM test is robust to background selection, population subdivision, and admixture (including hidden population structure). Moreover, when two high-frequency mutations are introduced, the MFDM test is robust to the misinference of derived and ancestral variants at segregating sites due to multiple hits. Finally, the sensitivity of the MFDM test in detecting balancing selection is also discussed. In summary, this work demonstrates that summary statistics based on tree topology can be used to detect selection, and it provides a reliable method that can distinguish selection from demography even when DNA polymorphism data from only one locus are available.
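As an illustration of the quantity the test is built on, the sketch below computes the maximum frequency of derived mutations among the segregating sites of a single locus. It is a minimal sketch under stated assumptions, not the test itself: the abstract does not give the test's null distribution or p-value, so none is computed here. The function name `max_derived_frequency` and the input encoding (one row per sampled sequence, `0` for the ancestral state and `1` for the derived state, already polarized with an outgroup) are hypothetical choices made for this example.

```python
from typing import Optional, Sequence


def max_derived_frequency(haplotypes: Sequence[Sequence[int]]) -> Optional[float]:
    """Return the largest derived-allele frequency among segregating sites.

    Assumed encoding (hypothetical, for illustration only): each row is one
    sampled sequence, each column is one site, 0 = ancestral, 1 = derived.
    Returns None when the sample contains no segregating site.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one sequence is required")

    # Count derived alleles per site by iterating over columns.
    derived_counts = [sum(column) for column in zip(*haplotypes)]

    # Keep only segregating sites: both states are present in the sample.
    segregating = [count for count in derived_counts if 0 < count < n]
    if not segregating:
        return None

    return max(segregating) / n


if __name__ == "__main__":
    # Four sequences, four sites; the last site is fixed for the derived
    # state and is therefore not counted as segregating.
    sample = [
        [1, 0, 1, 1],
        [1, 0, 0, 1],
        [1, 1, 0, 1],
        [0, 0, 0, 1],
    ]
    print(max_derived_frequency(sample))
```

A derived mutation carried by most but not all sampled sequences indicates an unbalanced tree at the locus, which is the feature the MFDM test examines. Turning this statistic into a significance test requires the null distribution given in the full paper.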
