Sorting genomes with rearrangements and segmental duplications through trajectory graphs

We study the problem of sorting genomes under an evolutionary model that includes genomic rearrangements and segmental duplications. We propose an iterative algorithm to improve any initial evolutionary trajectory between two genomes in terms of parsimony. Our algorithm is based on a new graphical model, the trajectory graph, which models not only the final states of two genomes but also an existing evolutionary trajectory between them. We show that redundant rearrangements in the trajectory correspond to certain cycles in the trajectory graph, and prove that our algorithm converges to an optimal trajectory for any initial trajectory involving only rearrangements.
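To make the notion of a redundant rearrangement concrete, here is a minimal sketch of a much simpler setting than the one studied here. It is not the trajectory graph or the proposed algorithm: it assumes a single linear chromosome written as a signed permutation, uses reversals as the only operation, and detects only the most obvious redundancy, a reversal immediately undone by an identical reversal. The names `apply_reversal`, `apply_trajectory`, and `drop_cancelling_reversals` are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Genome = List[int]          # signed permutation, e.g. [1, -3, -2, 4]
Reversal = Tuple[int, int]  # half-open index range [i, j) to reverse


def apply_reversal(genome: Genome, rev: Reversal) -> Genome:
    """Reverse the segment genome[i:j] and flip the sign of each marker."""
    i, j = rev
    return genome[:i] + [-g for g in reversed(genome[i:j])] + genome[j:]


def apply_trajectory(genome: Genome, trajectory: List[Reversal]) -> Genome:
    """Apply a sequence of reversals in order and return the final genome."""
    for rev in trajectory:
        genome = apply_reversal(genome, rev)
    return genome


def drop_cancelling_reversals(trajectory: List[Reversal]) -> List[Reversal]:
    """Remove pairs of identical consecutive reversals.

    A reversal is its own inverse, so two identical reversals in a row
    leave the genome unchanged. The stack also catches pairs that become
    adjacent only after an inner pair has been removed.
    """
    kept: List[Reversal] = []
    for rev in trajectory:
        if kept and kept[-1] == rev:
            kept.pop()
        else:
            kept.append(rev)
    return kept


start = [1, 2, 3, 4]
trajectory = [(1, 3), (0, 2), (0, 2), (2, 4)]
shorter = drop_cancelling_reversals(trajectory)

# Both trajectories reach the same genome, but the second uses fewer steps.
assert apply_trajectory(start, trajectory) == apply_trajectory(start, shorter)
assert len(shorter) < len(trajectory)
```

This local check misses redundancies spread across non-adjacent operations, which is the kind of structure the abstract attributes to cycles in the trajectory graph.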
