An automatic method for assessing structural importance of amino acid positions

Background: A great deal is known about the qualitative aspects of the sequence-structure relationship, for example that buried residues are usually more conserved between structurally similar homologues, but no attempts have been made to quantify the relationship between evolutionary conservation at a sequence position and change to global tertiary structure. In this paper we demonstrate that the Spearman correlation between sequence and structural change is suitable for this purpose.

Results: Buried residues, bends, cysteines, prolines and leucines were significantly more likely to occupy positions highly correlated with structural change than expected by chance. Some buried residues were found to be less informative than expected, particularly residues involved in active sites and the binding of small molecules.

Conclusion: The correlation-based method generates predictions of structural importance for superfamily positions which agree well with previous results of manual analyses, and may be of use in automated residue annotation pipelines. A PERL script which implements the method is provided.
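As a rough illustration of the idea of correlating sequence change with structural change at each alignment position, the following Python sketch computes a per-column Spearman correlation over all protein pairs. It is not the PERL script mentioned above, and several details are assumptions made only for this example: sequence change is scored as a simple 0/1 identity mismatch, structural change is taken from a hypothetical precomputed table of pairwise structural distances (`struct_dist`), gaps are not handled, and the names `average_ranks`, `spearman` and `position_scores` are invented here.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        # A fully conserved column (or constant distances) carries no signal.
        return 0.0
    return cov / (vx * vy) ** 0.5


def position_scores(alignment, struct_dist):
    """Score each alignment column by how well residue change tracks
    structural change across all pairs of proteins.

    alignment   -- dict: protein name -> aligned sequence (equal lengths)
    struct_dist -- dict: frozenset({name_a, name_b}) -> structural distance
                   (hypothetical input; any global dissimilarity would do)
    """
    names = sorted(alignment)
    pairs = list(combinations(names, 2))
    y = [struct_dist[frozenset(p)] for p in pairs]
    length = len(alignment[names[0]])
    scores = []
    for col in range(length):
        # Assumed sequence-change measure: 1 if the residues differ, else 0.
        x = [0.0 if alignment[a][col] == alignment[b][col] else 1.0
             for a, b in pairs]
        scores.append(spearman(x, y))
    return scores


# Toy data, invented for illustration only.
toy_alignment = {"A": "LK", "B": "LR", "C": "VK"}
toy_struct_dist = {
    frozenset(("A", "B")): 0.1,
    frozenset(("A", "C")): 0.6,
    frozenset(("B", "C")): 0.7,
}
toy_scores = position_scores(toy_alignment, toy_struct_dist)
```

In this toy case the first column, where the residue change coincides with the structurally divergent protein, scores higher than the second column, where substitutions are unrelated to structural distance. A substitution-matrix-based dissimilarity could replace the 0/1 mismatch without changing the rest of the sketch.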
