Scaffold Filling under the Breakpoint Distance

Motivated by the growing practice of sequencing genomes without finishing the complete sequence, Munoz et al. recently studied the problem of filling an incomplete multichromosomal genome (or scaffold) I with respect to a complete target genome G such that the genomic distance between I′ and G is minimized, where I′ is the filled scaffold. We call this the one-sided scaffold filling problem. In this paper, we follow Munoz et al. and investigate the scaffold filling problem under the breakpoint distance for the simplest case of unichromosomal genomes. When the input genome contains no gene repetition (i.e., is a fragment of a permutation), we show that the two-sided scaffold filling problem is polynomially solvable. However, when the input genome contains genes that appear twice, even the one-sided scaffold filling problem becomes NP-complete. Finally, using the ideas developed for the two-sided problem under the breakpoint distance, we show that the two-sided scaffold filling problem under the genomic (rearrangement) distance is also polynomially solvable.
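To make the one-sided objective concrete, the following is a minimal Python sketch and not the algorithm of the paper. It assumes unsigned, linear, unichromosomal genomes without repeated genes, and it counts a breakpoint as an adjacent gene pair of the filled scaffold that is not adjacent (in either order) in the target; end-of-chromosome adjacencies are ignored. The function names `adjacencies`, `breakpoints`, and `fill_one_sided` are hypothetical, and the filling routine is an exponential brute force meant only for tiny illustrative inputs.

```python
from itertools import permutations


def adjacencies(genome):
    """Unordered adjacent gene pairs of a linear, unsigned genome (assumed model)."""
    return {frozenset(pair) for pair in zip(genome, genome[1:])}


def breakpoints(a, g):
    """Number of adjacencies of a that are not adjacencies of g."""
    return len(adjacencies(a) - adjacencies(g))


def fill_one_sided(scaffold, target):
    """Brute-force one-sided filling for tiny inputs without repeated genes.

    Tries every ordering of the target's genes that keeps the scaffold's
    genes in their original relative order, and returns the first ordering
    found with the fewest breakpoints against the target, with that count.
    """
    kept = set(scaffold)
    best, best_count = None, None
    for cand in permutations(target):
        # The scaffold must survive as a subsequence of the filled genome.
        if [x for x in cand if x in kept] != list(scaffold):
            continue
        count = breakpoints(cand, target)
        if best_count is None or count < best_count:
            best, best_count = list(cand), count
    return best, best_count


target = [1, 2, 3, 4, 5]
print(breakpoints([1, 3, 5], target))     # 2: pairs (1,3) and (3,5) are not adjacent in the target
print(fill_one_sided([1, 3, 5], target))  # ([1, 2, 3, 4, 5], 0)
```

The brute force is only a specification of the objective. The abstract's polynomial-time result concerns the two-sided, repetition-free case, and its NP-completeness result concerns inputs where some genes appear twice, neither of which this sketch addresses.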
