Improved amino acid flexibility parameters

Protein molecules vary in flexibility across their three-dimensional structures: some segments show little mobility, while others are so disordered that they cannot be resolved by techniques such as X-ray crystallography. Atomic displacement parameters, or B-factors, from X-ray crystallographic studies give an experimentally determined indication of the degree of mobility in a protein structure. To provide better estimators of amino acid flexibility, we examined B-factors from a large set of high-resolution crystal structures. Because B-factors are not directly comparable across structures, they must be normalized. However, many proteins have segments of unusually high mobility, which must be accounted for before normalization can be performed. Accordingly, a median-based method from quality control studies was used to identify outliers. After the outliers were removed and the remaining B-factors of each protein chain were normalized, the values were collected for each amino acid in the set. The distribution of normalized B-factors followed a Gumbel (extreme value) distribution, and the location parameter (mode) of this distribution was used as the estimator of flexibility for each amino acid. These new parameters correlate more strongly with experimentally determined B-factors than parameters from earlier methods.
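The pipeline described above (median-based outlier removal, per-chain normalization, Gumbel location estimate) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name the outlier statistic, so the modified z-score based on the median absolute deviation with a cutoff of 3.5 is assumed here; normalization is assumed to mean zero mean and unit standard deviation; and the Gumbel location is estimated by the method of moments rather than by whatever fitting procedure the study used. All function names are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def remove_outliers(b_factors, threshold=3.5):
    """Drop values whose median-based (modified) z-score exceeds threshold.

    Assumption: modified z-score M = 0.6745 * |x - median| / MAD; the
    abstract only says "a median-based method from quality control".
    """
    med = statistics.median(b_factors)
    mad = statistics.median([abs(b - med) for b in b_factors])
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably.
        return list(b_factors)
    return [b for b in b_factors if 0.6745 * abs(b - med) / mad <= threshold]


def normalize(b_factors):
    """Scale one chain's B-factors to zero mean and unit standard deviation.

    Assumption: the abstract does not state the normalization formula.
    """
    mean = statistics.fmean(b_factors)
    sd = statistics.stdev(b_factors)
    return [(b - mean) / sd for b in b_factors]


def gumbel_location(values):
    """Method-of-moments estimate of the Gumbel location parameter (mode).

    For a Gumbel (maximum) distribution, mean = loc + gamma * scale and
    sd = pi * scale / sqrt(6), which gives the two lines below.
    """
    scale = statistics.stdev(values) * math.sqrt(6) / math.pi
    return statistics.fmean(values) - EULER_GAMMA * scale


# Toy chain with one unusually mobile residue (made-up numbers).
chain = [10, 12, 11, 13, 12, 11, 95]
kept = remove_outliers(chain)
z = normalize(kept)
print(kept)
print(gumbel_location(z))
```

In a full analysis, the normalized values from every chain would be pooled by amino acid type before `gumbel_location` is applied to each pool; the toy call above only shows that the steps compose.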
