Extraction of well-fitting substructures: root-mean-square deviation and the difference distance matrix.

The extraction of well-fitting substructures of two or more sets of proteins has applications to the analysis of mechanisms of conformational change in proteins (including pathways of evolution), to the classification of protein folding patterns, and to the evaluation of protein structure predictions. Many methods are known for extracting some substantial common substructure with low root-mean-square deviation (r.m.s.d.). A harder problem is addressed here: finding all common substructures with r.m.s.d. less than a prespecified threshold. Our approach is to consider the minimum, over all superpositions, of the maximum distance between corresponding points; this is superposition in the Chebyshev norm. Using the properties of Chebyshev superposition, we derive relationships between two common measures of structural similarity: the r.m.s.d. and the maximum element of the difference distance matrix. The results provide a basis for developing algorithms and software to identify all well-fitting subsets.
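The quantities named above can be made concrete with a small sketch. It is an illustration under stated assumptions, not the paper's algorithm: the function names and the two coordinate sets are hypothetical, and the deviations are computed for the coordinates as given, without searching for an optimal superposition. The two inequalities noted in the comments are elementary consequences of the triangle inequality and hold for any fixed superposition; they are not necessarily the relationships derived in the paper.

```python
import math
from itertools import combinations

def rmsd(a, b):
    """Root-mean-square deviation between corresponding points, as given."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

def max_deviation(a, b):
    """Largest distance between corresponding points (the Chebyshev-style measure)."""
    return max(math.dist(p, q) for p, q in zip(a, b))

def max_difference_distance(a, b):
    """Largest element of the difference distance matrix |d_ij(A) - d_ij(B)|."""
    return max(
        abs(math.dist(a[i], a[j]) - math.dist(b[i], b[j]))
        for i, j in combinations(range(len(a)), 2)
    )

# Hypothetical coordinates: B equals A except that the last point is displaced.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 2.0)]

r = rmsd(a, b)
m = max_deviation(a, b)
dd = max_difference_distance(a, b)

# Elementary bounds for any fixed superposition:
#   r.m.s.d. <= maximum deviation, and
#   max |d_ij(A) - d_ij(B)| <= 2 * maximum deviation (triangle inequality).
print(r <= m, dd <= 2 * m)
```

The difference distance matrix does not depend on how the two structures are superposed, whereas the r.m.s.d. and the maximum deviation do, which is why bounds linking them are useful when screening candidate substructures.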
