Detecting adaptive convergent amino acid evolution

In evolutionary genomics, researchers have taken an interest in identifying the substitutions that underlie convergent phenotypic adaptations. This is a difficult problem because it requires distinguishing foreground convergent substitutions, which are involved in the convergent phenotype, from background convergent substitutions. The latter may be linked to other adaptations, may be neutral, or may result from mutational biases. Furthermore, there is no generally accepted definition of convergent substitutions. Various methods that use different definitions have been proposed in the literature, resulting in different sets of candidate foreground convergent substitutions. In this article, we first describe the processes that can generate foreground convergent substitutions in coding sequences, separating adaptive from non-adaptive processes. Second, we review methods that have been proposed to detect foreground convergent substitutions in coding sequences and expose the assumptions that underlie them. Finally, we examine their power on simulations of convergent changes (including in the presence of a change in the efficacy of selection) and on empirical alignments. This article is part of the theme issue ‘Convergent evolution in the genomics era: new insights and directions’.
