A likelihood framework to analyse phyletic patterns

Probabilistic evolutionary models have revolutionized our ability to extract biological insights from sequence data. While these models accurately describe the stochastic process of site-specific substitutions, single-base substitutions represent only a fraction of the events that shape genomes. In microbes specifically, events in which entire genes are gained (e.g. via horizontal gene transfer) or lost play a pivotal evolutionary role. In this research, we present a novel likelihood-based evolutionary model for gene gains and losses, and use it to analyse genome-wide patterns of the presence and absence of gene families (phyletic patterns). The model assumes a Markovian stochastic process along an underlying phylogenetic tree, in which a gain is the transition from absence to presence and a loss is the transition from presence to absence. To account for differences in the rates of gain and loss among gene families, we assume among-gene-family rate variability, which allows for a more accurate description of the data. Using a Bayesian approach, we estimated an evolutionary rate for each gene family. Simulation studies demonstrated that our methodology accurately infers these rates. We applied the methodology to a large data set of 4873 gene families spanning 63 species, and the analysis revealed novel insights into the evolutionary nature of genome-wide gain and loss dynamics.
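The ingredients named above (a two-state Markov process on a tree, among-family rate variability, and a Bayesian rate estimate per gene family) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the toy four-species tree, the stationary root distribution, the single rate multiplier that scales gain and loss together, and the four equiprobable rate categories (standing in for a discretized rate distribution) are all assumptions made for illustration.

```python
import math

def transition_matrix(gain, loss, rate, t):
    """P[i][j] = Pr(state j at branch end | state i at branch start).
    States: 0 = gene family absent, 1 = present."""
    total = gain + loss
    pi1 = gain / total          # stationary probability of presence
    pi0 = loss / total          # stationary probability of absence
    decay = math.exp(-total * rate * t)
    return [[pi0 + pi1 * decay, pi1 * (1.0 - decay)],   # from absent
            [pi0 * (1.0 - decay), pi1 + pi0 * decay]]   # from present

def partial_likelihoods(node, pattern, gain, loss, rate):
    """Felsenstein pruning: likelihood of the data below `node` for each node state.
    A leaf is a species name; an internal node is a list of (child, branch_length)."""
    if isinstance(node, str):
        return [1.0 if pattern[node] == s else 0.0 for s in (0, 1)]
    result = [1.0, 1.0]
    for child, branch_length in node:
        below = partial_likelihoods(child, pattern, gain, loss, rate)
        p = transition_matrix(gain, loss, rate, branch_length)
        for s in (0, 1):
            result[s] *= p[s][0] * below[0] + p[s][1] * below[1]
    return result

def pattern_likelihood(tree, pattern, gain, loss, rate):
    """Likelihood of one phyletic pattern given a rate multiplier.
    Assumption: the root state is drawn from the stationary distribution."""
    total = gain + loss
    root = partial_likelihoods(tree, pattern, gain, loss, rate)
    return (loss / total) * root[0] + (gain / total) * root[1]

def posterior_mean_rate(tree, pattern, gain, loss, rates):
    """Posterior expectation of the rate over equiprobable rate categories."""
    weights = [pattern_likelihood(tree, pattern, gain, loss, r) for r in rates]
    return sum(r * w for r, w in zip(rates, weights)) / sum(weights)

# Hypothetical toy tree ((A:0.1,B:0.1):0.1,(C:0.1,D:0.1):0.1) and rate categories.
TREE = [([("A", 0.1), ("B", 0.1)], 0.1), ([("C", 0.1), ("D", 0.1)], 0.1)]
RATES = [0.25, 0.75, 1.25, 1.75]

if __name__ == "__main__":
    conserved = {"A": 1, "B": 1, "C": 1, "D": 1}
    patchy = {"A": 1, "B": 0, "C": 1, "D": 0}
    print(posterior_mean_rate(TREE, conserved, 1.0, 1.0, RATES))
    print(posterior_mean_rate(TREE, patchy, 1.0, 1.0, RATES))
```

In this sketch, a gene family present in every species should receive a posterior mean rate below the prior mean of 1.0, while a patchily distributed family should receive one above it, which is the qualitative behaviour a per-family rate estimate is meant to capture.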
