Tools and databases to analyze protein flexibility; approaches to mapping implied features onto sequences.

Publisher Summary: This chapter describes how protein flexibility can be analyzed statistically using a database. The database of macromolecular movements, accessible over the Internet, organizes a few hundred well-characterized motions first by size and then by packing; the key classifying feature is whether the motion involves a well-packed interface. The chapter describes the computational tools employed in the database analysis: (1) structure comparison, which is used to align and superpose different conformations; (2) adiabatic mapping interpolation, which is implemented on a large scale by the morph server and which provides movie-like pathways between two superposed conformations and, in the process, generates many standardized statistics; (3) normal mode analysis, which provides readily interpretable information about the flexibility of a single conformation; and (4) Voronoi volume calculations, which provide a rigorous basis for characterizing packing. The chapter also explains how structural features in the motions database can be related to sequence, an important part of the overall process of transferring annotation to uncharacterized genomic data. This mapping allows a sequence-propensity scale to be determined, describing how likely each amino acid is to occur in linkers in general or in flexible hinges in particular.
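As an illustration of the sequence-propensity idea mentioned in the summary, the following is a minimal sketch and not the chapter's actual procedure. It assumes a log-odds definition (frequency of a residue type in annotated linker positions divided by its overall frequency), which the summary does not specify. The function name `linker_propensity` and the mask format (an `L` marking each linker residue) are hypothetical.

```python
from collections import Counter
from math import log


def linker_propensity(sequences, linker_masks):
    """Return {residue: log-odds propensity} for occurring in linker positions.

    sequences    -- list of amino acid strings (hypothetical input format)
    linker_masks -- list of strings of the same lengths; 'L' marks a linker residue
    """
    total = Counter()   # counts of each residue type over all positions
    linker = Counter()  # counts of each residue type inside linkers only
    for seq, mask in zip(sequences, linker_masks):
        if len(seq) != len(mask):
            raise ValueError("sequence and mask lengths differ")
        for aa, flag in zip(seq, mask):
            total[aa] += 1
            if flag == "L":
                linker[aa] += 1

    n_total = sum(total.values())
    n_linker = sum(linker.values())
    if n_linker == 0:
        raise ValueError("no linker residues annotated")

    scale = {}
    for aa, count in total.items():
        f_linker = linker[aa] / n_linker
        f_background = count / n_total
        # Assumed convention: residues never observed in a linker get None
        # instead of log(0); a real analysis might use pseudocounts.
        scale[aa] = log(f_linker / f_background) if f_linker > 0 else None
    return scale


# Toy data, invented purely for illustration.
scale = linker_propensity(["AAGP", "GGAA"], ["--LL", "L---"])
```

Positive values indicate residue types enriched in linkers relative to the background composition, and negative values indicate depletion. The same counting could be restricted to hinge residues to obtain a hinge-specific scale.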
