On the Minimum Common Integer Partition Problem

We introduce a new combinatorial optimization problem, the Minimum Common Integer Partition (MCIP) problem, which is inspired by computational biology applications including ortholog assignment and DNA fingerprint assembly. A partition of a positive integer n is a multiset of positive integers that add up to exactly n, and an integer partition of a multiset S of positive integers is the multiset union of partitions of the integers in S. Given a sequence of multisets S1, ⋯, Sk of positive integers, where k ≥ 2, a multiset is a common integer partition if it is an integer partition of every multiset Si, 1 ≤ i ≤ k. The MCIP problem is to find a common integer partition of S1, ⋯, Sk with the minimum cardinality. The MCIP problem is NP-hard, since it generalizes the well-known Set Partition problem, and we show that it is in fact APX-hard. We also present a $\frac{5}{4}$-approximation algorithm for the MCIP problem when k = 2, and a $\frac{3k(k-1)}{3k-2}$-approximation algorithm for k ≥ 3.
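The definition of a common integer partition can be made concrete with a small checker. The following is a minimal sketch, not the algorithm of the paper: it uses exhaustive backtracking (exponential in the worst case) to decide whether a candidate multiset is an integer partition of each input multiset. The function names `is_integer_partition` and `is_common_partition` and the example instance are illustrative assumptions, and all integers are assumed to be positive.

```python
def is_integer_partition(parts, targets):
    """Return True if `parts` can be split into groups, one group per
    element of `targets`, such that each group sums to its target.
    Assumes all integers are positive. Brute-force backtracking."""
    parts = sorted(parts, reverse=True)   # place large parts first
    remaining = list(targets)             # capacity still open per target
    if sum(parts) != sum(remaining):
        return False

    def place(i):
        if i == len(parts):
            # Every target is filled exactly; since targets are positive,
            # each one received at least one part.
            return all(r == 0 for r in remaining)
        tried = set()
        for j, r in enumerate(remaining):
            # Skip targets with the same open capacity: they are equivalent.
            if r >= parts[i] and r not in tried:
                tried.add(r)
                remaining[j] -= parts[i]
                if place(i + 1):
                    return True
                remaining[j] += parts[i]
        return False

    return place(0)


def is_common_partition(parts, multisets):
    """Return True if `parts` is an integer partition of every multiset."""
    return all(is_integer_partition(parts, s) for s in multisets)


# Illustrative instance (an assumption, chosen for this sketch).
S1 = [3, 3, 4]
S2 = [2, 2, 6]
candidate = [2, 2, 3, 3]   # 4 = 2 + 2 in S1, and 6 = 3 + 3 in S2
print(is_common_partition(candidate, [S1, S2]))
print(is_common_partition(S1, [S1, S2]))
```

In this instance the candidate {2, 2, 3, 3} is a common integer partition of cardinality 4. No common partition of cardinality 3 exists, because a 3-element partition of S1 must be S1 itself, and S1 cannot form the 2s in S2.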
