The Zero Exemplar Distance Problem

Given two genomes with duplicate genes, ZERO EXEMPLAR DISTANCE is the problem of deciding whether the two genomes can be reduced to the same genome without duplicate genes by deleting all but one copy of each gene in each genome. Blin, Fertin, Sikora, and Vialette recently proved that ZERO EXEMPLAR DISTANCE for monochromosomal genomes is NP-hard even if each gene appears at most twice in each genome, thereby settling an important open question on genome rearrangement in the exemplar model. In this paper, we give a very simple alternative proof of this result. We also study ZERO EXEMPLAR DISTANCE for multichromosomal genomes without gene order. On the one hand, we show that this problem is NP-hard even if each gene appears at most twice in each genome. On the other hand, we show that it admits a polynomial-time algorithm if only one of the two genomes has duplicate genes, and that it is fixed-parameter tractable when the parameter is the maximum number of chromosomes in each genome.
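
To make the problem definition concrete, the following is a minimal brute-force sketch for the monochromosomal case. It is an illustration only, not the method of the paper: it assumes each genome is a sequence of unsigned gene labels (gene orientation and reversals are ignored), and the function names `exemplars` and `zero_exemplar_distance` are hypothetical. It enumerates every way of keeping one copy of each gene, so its running time is exponential in the number of duplicated genes.

```python
from itertools import product


def exemplars(genome):
    """Return the set of all exemplar genomes of `genome`.

    An exemplar genome keeps exactly one copy of each gene and
    preserves the relative order of the copies that are kept.
    Assumption: genes are unsigned labels (orientation is ignored).
    """
    # Map each gene to the list of positions where it occurs.
    positions = {}
    for index, gene in enumerate(genome):
        positions.setdefault(gene, []).append(index)

    result = set()
    # Choose one position per gene, in every possible way.
    for choice in product(*positions.values()):
        kept = sorted(choice)
        result.add(tuple(genome[i] for i in kept))
    return result


def zero_exemplar_distance(genome1, genome2):
    """Decide whether the two genomes share a common exemplar genome."""
    return not exemplars(genome1).isdisjoint(exemplars(genome2))


if __name__ == "__main__":
    # "abcab" and "bacb" can both be reduced to "acb".
    print(zero_exemplar_distance("abcab", "bacb"))
    # "aabb" reduces only to "ab", while "bbaa" reduces only to "ba".
    print(zero_exemplar_distance("aabb", "bbaa"))
```

The exhaustive enumeration is used here only because it mirrors the definition directly; it does not contradict the hardness result stated above, which concerns the existence of a polynomial-time algorithm.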
