DCJ-Indel sorting revisited

Background

The introduction of the double cut and join operation (DCJ) prompted a flurry of research into multichromosomal rearrangements. However, little of this work has incorporated indels (i.e., insertions and deletions of chromosomes and chromosomal intervals) into the calculation of genomic distance functions, with the exception of Braga et al., who provided a linear-time algorithm for the problem of DCJ-indel sorting. Although their algorithm runs in linear time, its derivation is lengthy and depends on a large number of possible cases.

Results

We observe that a deletion of a chromosomal interval can be viewed as a DCJ that creates a new circular chromosome. This framework allows us to amortize indels as DCJs, which in turn permits the application of the classical breakpoint graph to obtain a simplified indel model. The model still solves the problem of DCJ-indel sorting in linear time, via a more concise formulation that relies on the simpler problem of DCJ sorting. Furthermore, we extend this result to fully characterize the solution space of DCJ-indel sorting.

Conclusions

Encoding indels as DCJ operations offers new insight into why the problem of DCJ-indel sorting is ultimately no more difficult than that of sorting by DCJs alone. Open problems remain in this area, most notably sorting when the cost of an indel is allowed to differ from the cost of a DCJ and a minimum-cost transformation of one genome into another is required.
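The central observation, that deleting a chromosomal interval can be viewed as a DCJ that creates a new circular chromosome, can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a genome stored as a set of adjacencies between gene extremities (`x_t` for a tail, `x_h` for a head, with a telomere as a one-element adjacency), and the function names `dcj`, `chromosomes`, and `delete_genes` are hypothetical. Discarding the excised circle as a final step is one reading of how the deletion is completed.

```python
def dcj(genome, cut1, cut2):
    """Cut adjacencies (p, q) and (r, s); rejoin them as (p, s) and (r, q)."""
    (p, q), (r, s) = cut1, cut2
    old = {frozenset(cut1), frozenset(cut2)}
    assert old <= genome, "both adjacencies must exist in the genome"
    return (genome - old) | {frozenset((p, s)), frozenset((r, q))}


def chromosomes(genome):
    """Return a list of (sorted gene names, is_circular), one per chromosome."""
    partner = {}
    for adj in genome:
        if len(adj) == 2:  # one-element adjacencies are telomeres
            x, y = adj
            partner[x], partner[y] = y, x
    genes = {end[:-2] for adj in genome for end in adj}
    seen, result = set(), []
    for start in sorted(genes):
        if start in seen:
            continue
        component, circular, stack = set(), True, [start]
        while stack:
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            for end in (gene + "_t", gene + "_h"):
                if end in partner:
                    stack.append(partner[end][:-2])
                else:
                    circular = False  # reached a telomere, so linear
        seen |= component
        result.append((sorted(component), circular))
    return result


def delete_genes(genome, genes):
    """Drop every adjacency that touches one of the given genes."""
    return {adj for adj in genome if not any(e[:-2] in genes for e in adj)}


# Linear chromosome a b c d, written as adjacencies between extremities.
linear_abcd = {
    frozenset(["a_t"]),
    frozenset(["a_h", "b_t"]),
    frozenset(["b_h", "c_t"]),
    frozenset(["c_h", "d_t"]),
    frozenset(["d_h"]),
}

# One DCJ on the two adjacencies flanking the interval b c excises it
# as a circular chromosome and leaves the linear chromosome a d.
after_dcj = dcj(linear_abcd, ("a_h", "b_t"), ("c_h", "d_t"))

# Discarding the circle completes the deletion of the interval.
after_deletion = delete_genes(after_dcj, {"b", "c"})
```

Here `chromosomes(after_dcj)` reports a linear chromosome containing `a` and `d` and a circular chromosome containing `b` and `c`, so the interval deletion has been expressed as a single DCJ followed by the removal of a whole circular chromosome.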
