Chapter 9 - Reconciliation Approaches to Determining HGT, Duplications, and Losses in Gene Trees

Abstract: Bacterial genome content varies greatly, even between closely related species, owing to gene duplication, gene loss, and horizontal gene transfer (HGT). New genes arising from duplication or HGT give rise to new molecular functions within microbial genomes. An in-depth understanding of gene family evolution is therefore fundamental to bacterial genome annotation and gene function prediction. The genomic content of ancestral bacterial species is also of interest. This chapter provides a general introduction to the reconciliation of gene and genome histories, as implemented in various statistical frameworks. Several real data analysis problems illustrate the principles behind different reconciliation techniques. Explicit analysis protocols for use with various software programs are included with the chapter.
