Approximating the edit distance for genomes with duplicate genes under DCJ, insertion and deletion

Computing the edit distance between two genomes under certain operations is a basic problem in the study of genome evolution. The double-cut-and-join (DCJ) model has formed the basis for most algorithmic research on rearrangements over the last few years. The edit distance under the DCJ model can be easily computed for genomes without duplicate genes. In this paper, we study the edit distance for genomes with duplicate genes under a model that includes DCJ operations, insertions and deletions. We prove that computing the edit distance is equivalent to finding the optimal cycle decomposition of the corresponding adjacency graph, and give an approximation algorithm with an approximation ratio of 1.5 + ε.
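
To make the duplicate-free case concrete, the following is a minimal Python sketch of the standard adjacency-graph formula d = N - (C + I/2) for two genomes over the same N genes, where C is the number of cycles and I the number of odd-length paths in the adjacency graph (the formulation of Bergeron, Mixtacki and Stoye). It is not the approximation algorithm of this paper and does not handle duplicates, insertions or deletions. The genome encoding (a list of `(signed_gene_list, is_circular)` pairs with non-empty chromosomes) and the names `extremities`, `vertices` and `dcj_distance` are assumptions made for this illustration.

```python
from collections import Counter


def extremities(gene):
    """Left and right extremity of a signed gene: tail 't' then head 'h' for +g."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def vertices(genome):
    """Adjacencies (2 extremities) and telomeres (1 extremity) of a genome."""
    result = []
    for genes, circular in genome:  # assumed: every chromosome is non-empty
        ends = [extremities(g) for g in genes]
        for (_, right), (left, _) in zip(ends, ends[1:]):
            result.append((right, left))
        if circular:
            result.append((ends[-1][1], ends[0][0]))
        else:
            result.append((ends[0][0],))
            result.append((ends[-1][1],))
    return result


def dcj_distance(genome_a, genome_b):
    """DCJ distance N - (C + I/2) for genomes with the same duplicate-free gene set."""
    verts_a, verts_b = vertices(genome_a), vertices(genome_b)
    where_a = {x: i for i, v in enumerate(verts_a) for x in v}
    where_b = {x: len(verts_a) + j for j, v in enumerate(verts_b) for x in v}
    if (len(where_a) != sum(len(v) for v in verts_a)
            or len(where_b) != sum(len(v) for v in verts_b)
            or where_a.keys() != where_b.keys()):
        raise ValueError("genomes must share one duplicate-free gene set")

    # Union-find over all vertices; each extremity contributes one edge
    # between the A-vertex and the B-vertex that contain it.
    parent = list(range(len(verts_a) + len(verts_b)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x, i in where_a.items():
        parent[find(i)] = find(where_b[x])

    n_vertices = Counter(find(v) for v in range(len(parent)))
    n_edges = Counter(find(i) for i in where_a.values())

    # Every component is a cycle (edges == vertices) or a path (edges == vertices - 1).
    cycles = sum(1 for r in n_vertices if n_edges[r] == n_vertices[r])
    odd_paths = sum(1 for r in n_vertices
                    if n_edges[r] == n_vertices[r] - 1 and n_edges[r] % 2 == 1)
    n_genes = len(where_a) // 2
    return n_genes - (cycles + odd_paths // 2)


if __name__ == "__main__":
    a = [([1, 2, 3], False)]
    b = [([1, -2, 3], False)]  # a single inversion of gene 2
    print(dcj_distance(a, b))
```

With duplicate genes an extremity no longer determines a unique vertex in each genome, so the adjacency graph admits many decompositions into cycles and paths; that is the optimization the abstract refers to, and this sketch does not attempt it.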
