Reconstructing ancestral gene orders with duplications guided by synteny-level genome reconstruction

Background: Reconstructing ancestral gene orders in the presence of duplications is important for a better understanding of genome evolution. Current methods for ancestral reconstruction are limited by either computational constraints or the availability of reliable gene trees, and often ignore duplications altogether. Recently, methods that consider duplications in ancestral reconstructions have been developed, but the quality of reconstruction, measured by the number of contiguous ancestral regions found, decreases rapidly with the number of duplicated genes, which complicates the application of such approaches to mammalian genomes. Such high fragmentation is not encountered when reconstructing mammalian genomes at the synteny-block level, although the relative positions of genes cannot be recovered from such a reconstruction.

Results: We propose a new heuristic method, MultiRes, to reconstruct ancestral gene orders with duplications guided by homologous synteny blocks for a set of related descendant genomes. The method uses a synteny-level reconstruction to break the gene-order problem into several subproblems, whose solutions are then combined to disambiguate duplicated genes. We applied this method to both simulated and real data. Our results showed that MultiRes outperforms other methods in terms of gene content, gene adjacency, and common interval recovery.

Conclusions: This work demonstrates that the inclusion of synteny-level information can help us obtain better gene-level reconstructions. Our algorithm provides a basic toolbox for reconstructing ancestral gene orders with duplications. The source code of MultiRes is available at https://github.com/ma-compbio/MultiRes.
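To make the notion of gene adjacency recovery concrete, the following is a minimal sketch of how such a score could be computed when genes may be duplicated. It is not taken from MultiRes: the representation (signed integers for oriented genes, lists of linear chromosomes) and the function names `extremities`, `adjacencies`, and `adjacency_recovery` are assumptions made for illustration, and the evaluation used in the paper may differ in its details.

```python
from collections import Counter

def extremities(gene):
    """Return the (left, right) extremities of a signed gene as read left to right."""
    name = abs(gene)
    tail, head = (name, "t"), (name, "h")
    return (tail, head) if gene > 0 else (head, tail)

def adjacencies(genome):
    """Multiset of adjacencies in a genome given as a list of linear chromosomes.

    A Counter is used instead of a set because a duplicated gene can take
    part in the same adjacency more than once.
    """
    adj = Counter()
    for chrom in genome:
        for a, b in zip(chrom, chrom[1:]):
            right_of_a = extremities(a)[1]
            left_of_b = extremities(b)[0]
            # Sort the pair so an adjacency is the same in either reading direction.
            adj[tuple(sorted((right_of_a, left_of_b)))] += 1
    return adj

def adjacency_recovery(true_genome, reconstructed):
    """Precision and recall of reconstructed adjacencies against a known ancestor."""
    true_adj = adjacencies(true_genome)
    rec_adj = adjacencies(reconstructed)
    shared = sum((true_adj & rec_adj).values())  # multiset intersection
    precision = shared / max(1, sum(rec_adj.values()))
    recall = shared / max(1, sum(true_adj.values()))
    return precision, recall

# Hypothetical toy data: gene 2 is duplicated in the true ancestor, and the
# reconstruction is fragmented into two contiguous ancestral regions.
true_ancestor = [[1, 2, 3, 2, 4]]
reconstruction = [[1, 2, 3], [-4, -2]]
precision, recall = adjacency_recovery(true_ancestor, reconstruction)
```

In this toy case every reconstructed adjacency is correct, but the break between the two regions loses one of the four true adjacencies, which is the kind of fragmentation the abstract describes for gene-level reconstructions with many duplicates.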
