Reconstruction of ancestral gene orders using intermediate genomes

Background: The problem of reconstructing ancestral genomes in a given phylogenetic tree arises in many different fields of comparative genomics. Here, we focus on reconstructing the gene order of ancestral genomes, a problem that has been studied extensively over the past 20 years, especially with the increasing availability of whole-genome DNA sequences. There are two main approaches to this problem: event-based methods, which try to find the ancestral genomes that minimize the number of rearrangement events in the tree; and homology-based methods, which look for conserved structures, such as adjacent genes in the extant genomes, to build the ancestral genomes.

Results: We propose algorithms that use the concept of intermediate genomes, which arise in optimal pairwise rearrangement scenarios. We show that intermediate genomes have combinatorial properties that make them easy to reconstruct, and we develop fast algorithms that yield better reconstructed ancestral genomes than current event-based methods. The proposed framework is also designed to accept extra information, such as results from homology-based approaches, giving rise to combined algorithms with better results than the original methods.
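To make the notion of an intermediate genome concrete, the following minimal sketch tests whether a genome M can appear in some optimal rearrangement scenario between two genomes A and B, that is, whether d(A, M) + d(M, B) = d(A, B). The abstract does not name a distance, so the sketch assumes the double cut and join (DCJ) distance restricted to circular chromosomes with identical gene content and no duplicated genes, where d = n - c for n genes and c cycles in the adjacency graph. The function names (`adjacencies`, `dcj_distance`, `is_intermediate`) and the signed-integer encoding of genes are illustrative choices, not the authors' implementation.

```python
def adjacencies(genome):
    """Map every gene extremity to its neighbouring extremity.

    Assumption: `genome` is a list of circular chromosomes, each a list of
    signed integers, and every gene occurs exactly once in the genome.
    """
    partner = {}
    for chrom in genome:
        for i, gene in enumerate(chrom):
            nxt = chrom[(i + 1) % len(chrom)]  # wrap around: circular
            # Reading direction: a forward gene is traversed tail -> head.
            right = (abs(gene), "h" if gene > 0 else "t")
            left = (abs(nxt), "t" if nxt > 0 else "h")
            partner[right] = left
            partner[left] = right
    return partner


def dcj_distance(a, b):
    """DCJ distance for circular genomes: genes minus adjacency-graph cycles."""
    pa, pb = adjacencies(a), adjacencies(b)
    if pa.keys() != pb.keys():
        raise ValueError("genomes must have the same gene content")
    seen, cycles = set(), 0
    for start in pa:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Follow one edge of A, then one edge of B, until the cycle closes.
        while x not in seen:
            seen.add(x)
            y = pa[x]
            seen.add(y)
            x = pb[y]
    return len(pa) // 2 - cycles


def is_intermediate(a, m, b):
    """True if m lies on some optimal DCJ scenario between a and b."""
    return dcj_distance(a, m) + dcj_distance(m, b) == dcj_distance(a, b)


# Toy example: B differs from A by two inversions (of genes 2 and 4).
A = [[1, 2, 3, 4]]
B = [[1, -2, 3, -4]]
print(dcj_distance(A, B))                     # 2
print(is_intermediate(A, [[1, -2, 3, 4]], B))  # True: one inversion applied
print(is_intermediate(A, [[1, 3, 2, 4]], B))   # False: off every optimal path
```

The check costs three linear-time distance computations, which is one reason a distance-based characterization of intermediate genomes is convenient to work with; linear chromosomes and unequal gene content would need the full adjacency-graph formula with paths, which this sketch omits.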
