Estimating the Frequency of Horizontal Gene Transfer Using Phylogenetic Models of Gene Gain and Loss.

We analyze patterns of gene presence and absence in a maximum likelihood framework with rate parameters for gene gain and loss. Standard methods allow independent gains and losses in different parts of a tree. While multiple losses of the same gene are likely to be frequent, multiple gains need to be considered carefully. A gene gain could occur by horizontal transfer or by origin of a gene within the lineage being studied. If a gene is gained more than once, then at least one of these gains must be a horizontal transfer. A key parameter is the ratio of gain to loss rates, a/v. We consider the limiting case known as the infinitely many genes model, where a/v tends to zero and a gene cannot be gained more than once. The infinitely many genes model is used as a null model in comparison to models that allow multiple gains. Using genome data from cyanobacteria and archaea, we find that the likelihood is significantly improved by allowing for multiple gains, but the average a/v is very small. The fraction of genes whose presence/absence pattern is best explained by multiple gains is only 15% in the cyanobacteria and 20% and 39% in two data sets of archaea. The distribution of rates of gene loss is very broad, which explains why many genes follow a treelike pattern of vertical inheritance, despite the presence of a significant minority of genes that undergo horizontal transfer.
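To make the role of the gain rate a and loss rate v concrete, the following is a minimal sketch of how the likelihood of one presence/absence pattern might be computed on a fixed tree with a generic two-state gain/loss model and Felsenstein's pruning recursion. It is not the authors' implementation. The nested-tuple tree encoding, the function names, and the `root_present_prob` argument are assumptions made for illustration, and the sketch does not model the single gene origin that the infinitely many genes model permits.

```python
import math

def transition_probs(a, v, t):
    """2x2 transition matrix over a branch of length t.

    State 0 = gene absent, state 1 = gene present;
    a is the gain rate (0 -> 1) and v is the loss rate (1 -> 0).
    """
    total = a + v
    if total == 0.0:
        return [[1.0, 0.0], [0.0, 1.0]]
    decay = math.exp(-total * t)
    p_gain = a / total * (1.0 - decay)
    p_loss = v / total * (1.0 - decay)
    return [[1.0 - p_gain, p_gain], [p_loss, 1.0 - p_loss]]

def partial_likelihoods(node, a, v):
    """Pruning recursion on a hypothetical tree encoding.

    A leaf is the observed state (0 or 1). An internal node is a tuple
    of (child, branch_length) pairs. Returns [L(absent), L(present)].
    """
    if isinstance(node, int):
        return [1.0 if node == state else 0.0 for state in (0, 1)]
    result = [1.0, 1.0]
    for child, t in node:
        p = transition_probs(a, v, t)
        below = partial_likelihoods(child, a, v)
        for state in (0, 1):
            result[state] *= p[state][0] * below[0] + p[state][1] * below[1]
    return result

def pattern_likelihood(tree, a, v, root_present_prob):
    """Likelihood of the pattern, given the probability that the root has the gene."""
    absent, present = partial_likelihoods(tree, a, v)
    return (1.0 - root_present_prob) * absent + root_present_prob * present

# Gene present in one leaf of each of two separate clades.
left = ((1, 0.2), (0, 0.2))
right = ((1, 0.2), (0, 0.2))
tree = ((left, 0.5), (right, 0.5))

no_gain = pattern_likelihood(tree, a=0.0, v=1.0, root_present_prob=0.0)
with_gain = pattern_likelihood(tree, a=0.1, v=1.0, root_present_prob=0.0)
```

With the gene absent at the root and a = 0, this sketch assigns zero likelihood to any pattern containing a presence, whereas any a > 0 gives a positive likelihood through independent gains. That contrast is the intuition behind comparing a no-repeated-gain null model against models that allow multiple gains.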
