Beam search for the longest common subsequence problem

The longest common subsequence problem is a classical string problem concerned with finding the longest subsequence that is common to all strings in a given set. It has several important applications, for example in pattern recognition and computational biology. Most research efforts up to now have focused on solving this problem optimally. In comparison, only a few works deal with heuristic approaches. In this work we present a deterministic beam search algorithm. The results show that our algorithm outperforms the current state-of-the-art approaches not only in solution quality but often also in computation time.
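To make the general idea concrete, the following is a minimal sketch of a beam search for this problem. It is not the algorithm of this work, whose details the abstract does not give. The function names, the `beam_width` parameter, and the ranking rule (prefer partial solutions that leave the most unread characters in the most-consumed string) are assumptions chosen for illustration only.

```python
def is_subsequence(sub, s):
    """Return True if `sub` can be obtained from `s` by deleting characters."""
    it = iter(s)
    return all(ch in it for ch in sub)


def beam_search_lcs(strings, beam_width=10):
    """Heuristically build a common subsequence of `strings` (illustrative sketch)."""
    if not strings or any(not s for s in strings):
        return ""
    # Only letters that occur in every string can appear in a common subsequence.
    alphabet = sorted(set.intersection(*(set(s) for s in strings)))
    # A beam entry is (subsequence so far, next unread position in each string).
    beam = [("", tuple(0 for _ in strings))]
    best = ""
    while beam:
        candidates = {}
        for seq, pos in beam:
            for ch in alphabet:
                nxt = []
                for s, p in zip(strings, pos):
                    i = s.find(ch, p)  # earliest occurrence at or after p
                    if i < 0:
                        break  # ch is missing from the rest of some string
                    nxt.append(i + 1)
                else:
                    # All candidates on one level have equal length, so one
                    # representative per position vector is enough.
                    candidates.setdefault(tuple(nxt), seq + ch)
        if not candidates:
            break

        def score(item):
            # Assumed heuristic: shortest remaining suffix over all strings.
            pos, _ = item
            return min(len(s) - p for s, p in zip(strings, pos))

        # Deterministic ranking: best score first, ties broken alphabetically.
        ranked = sorted(candidates.items(), key=lambda it: (-score(it), it[1]))
        beam = [(seq, pos) for pos, seq in ranked[:beam_width]]
        best = beam[0][0]
    return best
```

A call such as `beam_search_lcs(["ABCBDAB", "BDCABA"], beam_width=5)` returns a string that is a subsequence of both inputs. A wider beam explores more partial solutions per level at a higher cost in time, and a beam wide enough to hold every candidate makes this sketch exhaustive.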
