Complexity and Polynomial-Time Approximation Algorithms around the Scaffolding Problem

In this paper we explore some complexity issues inspired by the contig scaffolding problem in bioinformatics. We focus on the following problem: given an undirected loopless graph and a perfect matching on this graph, find a set of cycles and paths that covers every vertex of the graph, whose edges alternate between the matching and its complement, and that satisfies a given constraint on the numbers of cycles and paths. We show that this problem is \(\mathcal{NP}\)-complete, even in bipartite graphs. We also give non-approximability and polynomial-time approximation results for the optimization versions of the problem.
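
To make the problem statement concrete, here is a minimal sketch of a checker for a candidate solution. It is an illustration, not the paper's algorithm. It rests on several assumptions: the components are vertex-disjoint, the constraint takes the form "at most `max_cycles` cycles and at most `max_paths` paths", and the name `is_alternating_cover` and its input encoding are hypothetical.

```python
def is_alternating_cover(vertices, edges, matching, components,
                         max_cycles, max_paths):
    """Check a candidate cover by alternating paths and cycles.

    components: list of (kind, seq) with kind in {"path", "cycle"} and
    seq the vertex sequence. Assumption: components are vertex-disjoint
    and the constraint is an upper bound on each count.
    """
    edge_set = {frozenset(e) for e in edges}
    match_set = {frozenset(e) for e in matching}
    seen = set()
    n_cycles = n_paths = 0

    for kind, seq in components:
        # No vertex may repeat within a component or across components.
        if len(set(seq)) != len(seq) or seen & set(seq):
            return False
        seen.update(seq)

        steps = list(zip(seq, seq[1:]))
        if kind == "cycle":
            if len(seq) < 4:  # an alternating cycle in a simple graph has >= 4 edges
                return False
            steps.append((seq[-1], seq[0]))  # closing edge
            n_cycles += 1
        else:
            n_paths += 1

        # Record, for each traversed edge, whether it lies in the matching.
        in_matching = []
        for u, v in steps:
            e = frozenset((u, v))
            if e not in edge_set:
                return False
            in_matching.append(e in match_set)

        # Consecutive edges must switch between matching and non-matching.
        if any(a == b for a, b in zip(in_matching, in_matching[1:])):
            return False
        # In a cycle the closing edge must also alternate with the first edge.
        if kind == "cycle" and in_matching[0] == in_matching[-1]:
            return False

    return (seen == set(vertices)
            and n_cycles <= max_cycles
            and n_paths <= max_paths)
```

For example, on the 4-cycle `a-b-c-d` with matching `{ab, cd}`, both the single cycle `[a, b, c, d]` and the two one-edge paths `[a, b]`, `[c, d]` pass the alternation check; which one is accepted depends only on the bounds `max_cycles` and `max_paths`. Verifying a candidate in this way takes polynomial time; the hardness shown in the paper concerns finding such a cover.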
