A Spectral Approach to Protein Structure Alignment

We introduce a new intrinsic geometry, based on spectral analysis, and use it to motivate methods for aligning protein folds. The geometry arises from the fact that a distance matrix can be scaled so that its eigenvalues are positive. We develop this geometry with mathematical rigor and use it to motivate two alignment algorithms. The first combines eigenvalues alone with dynamic programming to compute a fold alignment quickly; family identification results are reported for the Skolnick40 and Proteus300 data sets. The second extends the spectral method by iterating between the intrinsic geometry and the 3D geometry of a fold to produce high-quality alignments. Results and comparisons are reported for several difficult fold alignments, and the second algorithm is also shown to correctly identify fold families in the Skolnick40 and Proteus300 data sets.
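
As a rough illustration of the dynamic-programming step in the first algorithm, the sketch below aligns two folds that have each been reduced to one scalar per residue. The abstract does not say how eigenvalues are assigned to residues or how matches are scored, so the per-residue values, the scoring rule `tol - |a_i - b_j|`, the gap penalty, and the name `align_by_values` are all assumptions made for illustration; this is a generic Needleman-Wunsch style recursion, not necessarily the authors' formulation.

```python
def align_by_values(a, b, gap=-0.5, tol=1.0):
    """Globally align two lists of per-residue scalars by dynamic programming.

    Hypothetical scoring (an assumption, not taken from the paper):
    pairing residues i and j earns tol - |a[i] - b[j]|; skipping a
    residue in either fold costs `gap`.
    Returns (total_score, list of matched index pairs).
    """
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Fill the table: best score for aligning prefixes a[:i] and b[:j].
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = score[i - 1][j - 1] + tol - abs(a[i - 1] - b[j - 1])
            skip_a = score[i - 1][j] + gap
            skip_b = score[i][j - 1] + gap
            score[i][j] = max(match, skip_a, skip_b)

    # Trace back from the corner to recover the matched residue pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        match = score[i - 1][j - 1] + tol - abs(a[i - 1] - b[j - 1])
        if score[i][j] == match:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Made-up values: the second fold lacks a counterpart for the residue valued 2.0.
total, pairs = align_by_values([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 4.0])
```

The table has (n+1)(m+1) entries and each is filled in constant time, which is consistent with the abstract's emphasis on speed for an alignment driven by scalar values alone.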
