Approximation of greedy algorithms for Max-ATSP, Maximal Compression, and Shortest Cyclic Cover of Strings

Given a directed graph G with weights on its arcs, the Maximum Asymmetric Travelling Salesman Problem (Max-ATSP) asks for a Hamiltonian path of maximum weight covering G. Max-ATSP, a central problem in computer science, is known to be NP-hard and hard to approximate. In the general case, when the Triangle Inequality is not satisfied, the best approximation ratio known to date equals 2/3. Now consider the Overlap Graph for a set of finite words P := {s_1, ..., s_p}: the directed graph in which an arc links two words with a weight equal to the length of their maximal overlap. When Max-ATSP is applied to the Overlap Graph, it solves the Maximal Compression or Shortest Superstring problem, where one searches for a string of minimal length having each input word as a substring. Again, these problems are hard to approximate. Both for Max-ATSP and for Maximal Compression, good approximation algorithms use a cover of the graph by a set of cycles, or of the words by a set of cyclic strings. These questions are known as Maximal Directed Cyclic Cover (MDCC) and Shortest Cyclic Cover of Strings (SCCS), and can be solved in polynomial time. However, among these four problems, the approximation ratio achieved by a simple greedy algorithm is known only for Maximal Compression. In a seminal but complex proof, Tarhio and Ukkonen showed that the greedy algorithm achieves a compression ratio of 1/2. Taking advantage of the power of subset systems, we investigate the approximation of the associated greedy algorithms for these four problems, and show that they reach a ratio of 1/3 for Max-ATSP, 1/2 for Maximal Compression and for Maximal Cyclic Cover, and give an exact solution for the Shortest Cyclic Cover of Strings. The proof for Maximal Compression is simpler than known ones.
For these problems, greedy algorithms are easier to implement and often faster than existing approximation algorithms, an important fact since these problems have practical applications, for instance in data compression and computational biology.
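To make the Overlap Graph and the greedy strategy concrete, the following is a minimal Python sketch of the classical greedy merging heuristic for Maximal Compression: it repeatedly merges the two distinct words with the largest overlap. The function names `overlap` and `greedy_superstring` are illustrative and do not come from the paper, the pre-filtering of words contained in other words is an assumption of this sketch, and the quadratic pair search is chosen for readability rather than speed.

```python
def overlap(u, v):
    """Length of the longest proper suffix of u that is also a prefix of v.

    This is the arc weight from u to v in the overlap graph.
    """
    for k in range(min(len(u), len(v)) - 1, 0, -1):
        if u.endswith(v[:k]):
            return k
    return 0


def greedy_superstring(words):
    """Greedy merging sketch (hypothetical helper, not the paper's code).

    Assumption: words occurring inside another word are dropped first,
    since any superstring of the remaining words already contains them.
    """
    words = list(dict.fromkeys(words))  # remove duplicates, keep order
    words = [w for w in words if not any(w != x and w in x for x in words)]
    while len(words) > 1:
        # Pick the ordered pair (i, j), i != j, with maximum overlap.
        k, i, j = max(
            (
                (overlap(u, v), i, j)
                for i, u in enumerate(words)
                for j, v in enumerate(words)
                if i != j
            ),
            key=lambda t: t[0],
        )
        merged = words[i] + words[j][k:]
        words = [w for idx, w in enumerate(words) if idx not in (i, j)]
        words.append(merged)
    return words[0] if words else ""


if __name__ == "__main__":
    P = ["abcd", "cdef", "efgh"]
    s = greedy_superstring(P)
    # Compression = total input length minus superstring length.
    print(s, sum(map(len, P)) - len(s))
```

The compression printed at the end is the quantity whose greedy approximation ratio the abstract discusses; the superstring length is the complementary measure.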

[1] David Maier, et al. On Finding Minimal Length Superstrings, 1980, J. Comput. Syst. Sci.

[2] Kenneth Steiglitz, et al. Combinatorial Optimization: Algorithms and Complexity, 1981.

[3] Esko Ukkonen, et al. A Greedy Approximation Algorithm for Constructing Shortest Common Superstrings, 1988, Theor. Comput. Sci.

[4] Jonathan S. Turner, et al. Approximation Algorithms for the Shortest Common Superstring Problem, 1989, Inf. Comput.

[5] Tao Jiang, et al. Linear approximation of shortest superstrings, 1991, STOC '91.

[6] Dan Gusfield, et al. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology, 1997.

[7] Bodo Manthey, et al. Approximating Maximum Weight Cycle Covers in Directed Graphs with Weights Zero and One, 2005, Algorithmica.

[8] Moshe Lewenstein, et al. Approximation algorithms for asymmetric TSP by decomposing directed regular multigraphs, 2005, JACM.

[9] Markus Bläser, et al. Improved approximation algorithms for metric maximum ATSP and maximum 3-cycle cover problems, 2005, Oper. Res. Lett.

[10] Georg Schnitger, et al. On the Greedy Superstring Conjecture, 2003, SIAM J. Discret. Math.

[11] Julián Mestre, et al. Greedy in Approximation Algorithms, 2006, ESA.

[12] Arthur Cayley, et al. The Collected Mathematical Papers: On Monge's “Mémoire sur la théorie des déblais et des remblais”, 2009.

[13] Khaled M. Elbassioni, et al. Simpler Approximation of the Maximum Asymmetric Traveling Salesman Problem, 2012, STACS.

[14] Marcin Mucha, et al. Lyndon Words and Short Superstrings, 2012, SODA.