On the Complexity of Protein Local Structure Alignment Under the Discrete Fréchet Distance

Protein structure alignment is a fundamental problem in computational and structural biology. While there are many experimental and heuristic methods and empirical results, very little is known about the algorithmic and complexity aspects of the problem, especially for protein local structure alignment. A well-known measure of the similarity of two polygonal chains is the Fréchet distance, and a related measure, the discrete Fréchet distance, has recently been used in protein-related research. In this paper, following the recent work of Jiang et al., we investigate the protein local structure alignment problem under the bounded discrete Fréchet distance. Given m proteins (or protein backbones, which are 3D polygonal chains), each of length O(n), our main results are as follows:
* If the number of proteins, m, is not part of the input, then the problem is NP-complete; moreover, under the bounded discrete Fréchet distance it is NP-hard to approximate the maximum-size common local structure within a factor of n^(1-ε). These results hold both when all the proteins are static and when translation and rotation are allowed.
* If the number of proteins, m, is a constant, then the problem can be solved in polynomial time.
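For readers unfamiliar with the measure, the discrete Fréchet distance between two static chains can be computed with a standard dynamic program over pairs of vertices. The sketch below is illustrative only and is not the local alignment algorithm of this paper: the function name `discrete_frechet` and the representation of a backbone as a list of 3D coordinate tuples are assumptions made for the example, and no translation or rotation is applied.

```python
import math


def discrete_frechet(P, Q):
    """Discrete Fréchet distance between two non-empty point sequences.

    P and Q are lists of coordinate tuples (hypothetical representation
    of two static protein backbones). Runs in O(len(P) * len(Q)) time.
    """
    n, m = len(P), len(Q)
    if n == 0 or m == 0:
        raise ValueError("both chains must be non-empty")

    # ca[i][j] holds the discrete Fréchet distance between the prefixes
    # P[0..i] and Q[0..j].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])  # Euclidean distance of the current pair
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Advance on P, on Q, or on both; keep the best predecessor.
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


if __name__ == "__main__":
    # Two parallel chains one unit apart.
    A = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
    B = [(0, 1, 0), (1, 1, 0), (2, 1, 0)]
    print(discrete_frechet(A, B))
```

A bounded variant, in the sense used above, would ask only whether this value is at most a given threshold; the table-filling structure stays the same.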

[1]  P E Bourne,et al.  Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. , 1998, Protein engineering.

[2]  David C. Jones,et al.  CATH--a hierarchic classification of protein domain structures. , 1997, Structure.

[3]  Tao Jiang,et al.  On the Approximation of Shortest Common Supersequences and Longest Common Subsequences , 1994, SIAM J. Comput..

[4]  Binhai Zhu,et al.  Protein Structure-Structure Alignment with Discrete Fréchet Distance , 2007, APBC.

[5]  Samarjit Chakraborty,et al.  Computing Largest Common Point Sets under Approximate Congruence , 2000, ESA.

[6]  Helmut Alt,et al.  Matching Polygonal Curves with Respect to the Fréchet Distance , 2001, STACS.

[7]  Johan Håstad,et al.  Clique is hard to approximate within n^(1-ε) , 1996, Proceedings of 37th Conference on Foundations of Computer Science.

[8]  H. Mannila,et al.  Computing Discrete Fréchet Distance , 1994 .

[9]  H. Hahn Sur quelques points du calcul fonctionnel , 1908 .

[10]  Shuai Cheng Li,et al.  Finding Compact Structural Motifs , 2007, CPM.

[11]  Helmut Alt,et al.  Measuring the resemblance of polygonal curves , 1992, SCG '92.

[12]  Helmut Alt,et al.  Computing the Fréchet distance between two polygonal curves , 1995, Int. J. Comput. Geom. Appl..

[13]  Lars Engebretsen,et al.  Clique Is Hard To Approximate Within , 2000 .

[14]  Tao Jiang,et al.  On the Approximation of Shortest Common Supersequences and Longest Common Subsequences , 1995, SIAM J. Comput..

[15]  Kevin Buchin,et al.  Computing the Fréchet distance between simple polygons in polynomial time , 2006, SCG '06.

[16]  Piotr Indyk,et al.  Approximate nearest neighbor algorithms for Frechet distance via product metrics , 2002, SCG '02.

[17]  Binhai Zhu,et al.  Protein Structure-Structure Alignment with Discrete Fréchet Distance , 2008, J. Bioinform. Comput. Biol..

[18]  Marcus Schaefer,et al.  Paired Pointset Traversal , 2004, ISAAC.

[19]  J. Håstad Clique is hard to approximate within n^(1-ε) , 1999 .

[20]  C. Sander,et al.  Protein structure comparison by alignment of distance matrices. , 1993, Journal of molecular biology.

[21]  J. Håstad Clique is hard to approximate within n^(1-ε) , 1996 .

[22]  Homayoun Valafar,et al.  Tali: Local Alignment of protein Structures Using Backbone Torsion Angles , 2008, J. Bioinform. Comput. Biol..

[23]  W R Taylor,et al.  Protein structure alignment. , 1989, Journal of molecular biology.

[24]  Alon Itai,et al.  Geometry Helps in Bottleneck Matching and Related Problems , 2001, Algorithmica.

[25]  Carola Wenk,et al.  Shape matching in higher dimensions , 2003 .

[26]  Ruth Nussinov,et al.  Recognition of Binding Patterns Common to a Set of Protein Structures , 2005, RECOMB.

[27]  Tatsuya Akutsu,et al.  Protein Structure Alignment Using Dynamic Programing and Iterative Improvement , 1996 .

[28]  Liisa Holm,et al.  DaliLite workbench for protein structure comparison , 2000, Bioinform..

[29]  Thomas H. Cormen,et al.  Introduction to algorithms [2nd ed.] , 2001 .

[30]  M. Godau On the complexity of measuring the similarity between geometric objects in higher dimensions , 1999 .

[31]  Osvaldo Olmea,et al.  MAMMOTH (Matching molecular models obtained from theory): An automated method for model comparison , 2002, Protein science : a publication of the Protein Society.

[32]  Tim J. P. Hubbard,et al.  SCOP: a structural classification of proteins database , 1998, Nucleic Acids Res..