Tripeptide analysis of protein structures

Background

Tripeptides can serve as efficient building blocks for protein structure prediction. We examined 8000 different tripeptides from a dataset of 1220 high-resolution (≤ 2.0 Å) structures from the Protein Data Bank (PDB) to determine which are structurally rigid and which are non-rigid. The data have been statistically analyzed, discussed, and summarized, and the full dataset can be used for building protein structures.

Results

Tripeptides were classified into three categories (rigid, non-rigid, and intermediate) based on the relative structural rigidity between Cα and Cβ atoms in a tripeptide. We found that 18% of the tripeptides in the dataset can be classified as rigid, 4% as non-rigid, and 78% as intermediate, so the bulk fall in the intermediate class and very few are non-rigid. Many rigid tripeptides are made of hydrophobic residues; however, some tripeptides with polar side chains also form rigid structures. Structurally, the rigid tripeptides essentially form two structural classes, while the intermediate and non-rigid tripeptides fall into one structural class. This notion of rigidity and non-rigidity is designed to capture side-chain interactions but not secondary structures.

Conclusions

Rigid tripeptides show no correlation with the secondary structures in proteins, so this work is complementary to secondary-structure studies. Tripeptide data may be used to predict plausible structures for oligopeptides and for de novo protein design.
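The figure of 8000 matches the number of ordered triples over the 20 standard amino acids (20^3). As a minimal sketch, assuming one-letter residue codes, the Python below enumerates those triples and tallies the overlapping tripeptides in a sequence. The names `AMINO_ACIDS`, `ALL_TRIPEPTIDES`, and `count_tripeptides` are hypothetical, and the sketch does not reproduce the rigidity measure, which the abstract does not specify.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (an assumption of this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Every ordered triple of residues: 20 * 20 * 20 combinations.
ALL_TRIPEPTIDES = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]


def count_tripeptides(sequence):
    """Count overlapping tripeptides in a one-letter protein sequence.

    Windows containing a non-standard residue code (e.g. 'X') are skipped.
    """
    counts = Counter()
    for i in range(len(sequence) - 2):
        tri = sequence[i:i + 3]
        if all(res in AMINO_ACIDS for res in tri):
            counts[tri] += 1
    return counts


if __name__ == "__main__":
    print(len(ALL_TRIPEPTIDES))
    print(count_tripeptides("ACDACD"))
```

Tallies like these, accumulated over a structure dataset, would give the per-tripeptide sample sizes on which any rigidity statistic could then be computed.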
