A substitution matrix for structural alphabet based on structural alignment of homologous proteins and its applications

Analysis of protein structures based on backbone structural patterns, known as structural alphabets, has been shown to be very useful. Among these alphabets, a set of 16 pentapeptide structural motifs known as protein blocks (PBs) has been identified, from which a backbone model of most protein structures can be built. PBs allow a 3D structure to be simplified into a 1D sequence of PBs. Here, for the first time, substitution probabilities of PBs in a large number of aligned homologous protein structures have been studied and expressed as a simplified 16 × 16 substitution matrix. The matrix was validated by benchmarking how well it aligns sequences of PBs, much as amino acid sequences are aligned, to identify structurally equivalent regions in closely or distantly related proteins using a dynamic programming approach. The alignments obtained are comparable to those of well-established structure comparison methods such as DALI and STAMP. Other applications of the matrix have also been investigated. First, we show that, in variable regions between two superimposed homologous proteins, one can distinguish between local conformational differences and rigid‐body displacement of a conserved motif by comparing the PBs and their substitution scores. Second, we demonstrate, with the example of aspartic proteinases, that PBs can be used efficiently to detect lobe/domain flexibility in multidomain proteins. Lastly, using a protein kinase as an example, we identify regions of conformational variation and rigid‐body movement in the enzyme as it changes from the inactive state to the active state. Proteins 2006. © 2006 Wiley‐Liss, Inc.
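To make the two computational steps in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that PBs are written with the letters a to p, that substitution scores are log-odds values estimated from aligned PB pairs with a pseudocount, and that alignment is global with a linear gap penalty (Needleman-Wunsch). The function names, the pseudocount, the gap value, and the toy sequences are all hypothetical choices for illustration; the paper's actual scoring scheme and gap treatment are not given here.

```python
import math
from collections import Counter

PB_ALPHABET = "abcdefghijklmnop"  # assumed one-letter codes for the 16 PBs


def log_odds_matrix(aligned_pairs, pseudocount=1.0):
    """Estimate a symmetric 16 x 16 log-odds table from aligned PB sequences.

    aligned_pairs: iterable of (seq_a, seq_b), equal length, '-' marks a gap.
    The pseudocount is an illustrative assumption, not taken from the paper.
    """
    counts = Counter()
    for seq_a, seq_b in aligned_pairs:
        for x, y in zip(seq_a, seq_b):
            if x == "-" or y == "-":
                continue  # gapped columns carry no substitution information
            counts[(x, y)] += 1
            counts[(y, x)] += 1  # count both orders so the table is symmetric
    cells = len(PB_ALPHABET) ** 2
    total = sum(counts.values()) + pseudocount * cells
    joint = {(x, y): (counts[(x, y)] + pseudocount) / total
             for x in PB_ALPHABET for y in PB_ALPHABET}
    marginal = {x: sum(joint[(x, y)] for y in PB_ALPHABET) for x in PB_ALPHABET}
    # Observed pair frequency relative to the frequency expected by chance.
    return {(x, y): math.log2(joint[(x, y)] / (marginal[x] * marginal[y]))
            for x in PB_ALPHABET for y in PB_ALPHABET}


def global_align(s, t, score, gap=-2.0):
    """Needleman-Wunsch global alignment of two PB sequences (linear gaps)."""
    rows, cols = len(s) + 1, len(t) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, cols):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = max(dp[i - 1][j - 1] + score[(s[i - 1], t[j - 1])],
                           dp[i - 1][j] + gap,
                           dp[i][j - 1] + gap)
    # Trace back from the bottom-right corner, preferring the diagonal move.
    i, j = len(s), len(t)
    out_s, out_t = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score[(s[i - 1], t[j - 1])]:
            out_s.append(s[i - 1]); out_t.append(t[j - 1]); i -= 1; j -= 1
        elif i > 0 and (j == 0 or dp[i][j] == dp[i - 1][j] + gap):
            out_s.append(s[i - 1]); out_t.append("-"); i -= 1
        else:
            out_s.append("-"); out_t.append(t[j - 1]); j -= 1
    return dp[-1][-1], "".join(reversed(out_s)), "".join(reversed(out_t))


if __name__ == "__main__":
    # Toy training data, invented for illustration only.
    training = [("mmmmmmnopacddddd", "mmmmmmnopacddddd"),
                ("mmmmklmmccdd", "mmmmklmpacdd")]
    matrix = log_odds_matrix(training)
    total_score, row_s, row_t = global_align("mmnopacdd", "mmnacdd", matrix)
    print(row_s)
    print(row_t)
    print(round(total_score, 2))
```

In this sketch, positions where the two aligned rows carry PBs with a high substitution score would be read as structurally equivalent, while low-scoring or gapped columns flag candidate regions of local conformational difference. A real application would replace the toy training pairs with PB sequences derived from structure-based alignments of homologous proteins.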
