New Results About the Linearization of Scaffolds Sharing Repeated Contigs

Solutions to genome scaffolding problems can be represented as paths and cycles in a “solution graph”. However, when contigs are repeated, such solution graphs may contain branchings and thus may not be uniquely convertible into sequences. Having introduced various ways of extracting the unique parts of such solutions, we extend previously known NP-hardness results to the case where the solution graph is planar, bipartite, and subcubic, and show that no polynomial-time approximation scheme (PTAS) exists in this case.
