A Practical Solution for Aligning and Simplifying Pairs of Protein Backbones under the Discrete Fréchet Distance

Aligning and comparing two polygonal chains in 3D space is an important problem in many areas of research, such as protein structure alignment. Much past work on this problem uses RMSD as the distance measure. Recently, Jiang et al. applied the discrete Fréchet distance to align and simplify protein backbones (geometrically, 3D polygonal chains) and obtained insightful new results. However, because a protein backbone can have as many as 500-600 vertices, even when a pair of chains is well aligned it remains difficult for humans to visualize their similarities and differences unless the chains are identical. In 2008, a problem called CPS-3F was proposed: simplify a pair of 3D chains simultaneously under the discrete Fréchet distance. Whether CPS-3F is NP-complete remains open. In this paper, we first present a new practical method for aligning a pair of protein backbones, improving on the previous method by Jiang et al. We then present a greedy-and-backtrack method, which uses the new alignment method as a subroutine, to handle the CPS-3F problem. We also prove two simple lemmas that give some evidence of why the new method works well. Finally, we present preliminary empirical results on proteins from the Protein Data Bank (PDB), with comparisons to the previous method.
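For readers unfamiliar with the distance measure named above, the following is a minimal sketch of the standard dynamic-programming computation of the discrete Fréchet distance between two 3D polygonal chains held in fixed positions. It is not the alignment or simplification method of this paper, which additionally searches over placements of one chain relative to the other. The function name `discrete_frechet` and the tuple-based chain representation are assumptions made for illustration.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two chains.

    p and q are non-empty lists of (x, y, z) tuples (an assumed
    representation). Runs in O(len(p) * len(q)) time and space.
    """
    if not p or not q:
        raise ValueError("both chains need at least one vertex")
    n, m = len(p), len(q)
    # ca[i][j] holds the discrete Fréchet distance between the
    # prefixes p[0..i] and q[0..j].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])  # Euclidean distance (Python 3.8+)
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                # Only the walker on q can have advanced.
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                # Only the walker on p can have advanced.
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Either walker, or both, advanced by one vertex.
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


# Two parallel three-vertex chains, one unit apart.
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
chain_b = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
print(discrete_frechet(chain_a, chain_b))  # 1.0
```

The table is filled iteratively rather than recursively so that chains with several hundred vertices, as is typical for protein backbones, do not hit Python's recursion limit.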

[1]  Eduardo Sany Laber,et al.  LATIN 2008: Theoretical Informatics, 8th Latin American Symposium, Búzios, Brazil, April 7-11, 2008, Proceedings , 2008, Lecture Notes in Computer Science.

[2]  Helmut Alt,et al.  Approximate Matching of Polygonal Shapes (Extended Abstract) , 1991, SCG.

[3]  M. Hermodson,et al.  Structural homology between rbs repressor and ribose binding protein implies functional similarity , 1992, Protein science : a publication of the Protein Society.

[4]  M. Fréchet.  Sur quelques points du calcul fonctionnel , 1906 .

[5]  Liisa Holm,et al.  DaliLite workbench for protein structure comparison , 2000, Bioinform..

[6]  David C. Jones,et al.  CATH--a hierarchic classification of protein domain structures. , 1997, Structure.

[7]  Richard Cole,et al.  Slowing down sorting networks to obtain faster sorting algorithms , 1987, JACM.

[9]  C. Sander,et al.  Protein structure comparison by alignment of distance matrices. , 1993, Journal of molecular biology.

[10]  S. B. Needleman,et al.  A general method applicable to the search for similarities in the amino acid sequence of two proteins. , 1970, Journal of molecular biology.

[11]  Sergey Bereg,et al.  Voronoi Diagram of Polygonal Chains under the Discrete Fréchet Distance , 2007, Int. J. Comput. Geom. Appl..

[12]  Sergey Bereg,et al.  Simplifying 3D Polygonal Chains Under the Discrete Fréchet Distance , 2008, LATIN.

[13]  Dong Xu,et al.  ProteinDBS: a real-time retrieval system for protein structure comparison , 2004, Nucleic Acids Res..

[14]  H. Mannila,et al.  Computing Discrete Fréchet Distance , 1994 .

[15]  Helmut Alt,et al.  Measuring the resemblance of polygonal curves , 1992, SCG '92.

[16]  Helmut Alt,et al.  Computing the Fréchet distance between two polygonal curves , 1995, Int. J. Comput. Geom. Appl..

[17]  Binhai Zhu,et al.  Protein Structure-structure Alignment with Discrete Fréchet Distance , 2008, J. Bioinform. Comput. Biol..

[18]  Osvaldo Olmea,et al.  MAMMOTH (Matching molecular models obtained from theory): An automated method for model comparison , 2002, Protein science : a publication of the Protein Society.

[19]  P E Bourne,et al.  Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. , 1998, Protein engineering.

[20]  Carola Wenk,et al.  Shape matching in higher dimensions , 2003 .

[21]  Tim J. P. Hubbard,et al.  SCOP: a Structural Classification of Proteins database , 1999, Nucleic Acids Res..

[22]  Helmut Alt,et al.  Matching Polygonal Curves with Respect to the Fréchet Distance , 2001, STACS.

[23]  Jinn-Moon Yang,et al.  Protein structure database search and evolutionary classification , 2006, Nucleic acids research.

[24]  Binhai Zhu.  On the Complexity of Protein Local Structure Alignment Under the Discrete Fréchet Distance , 2007, J. Comput. Biol..

[25]  W R Taylor,et al.  Protein structure alignment. , 1989, Journal of molecular biology.