The s2D method: simultaneous sequence-based prediction of the statistical populations of ordered and disordered regions in proteins.

Large amounts of information about protein sequences are becoming available, as shown by the more than 79 million entries in the UniProt database. Yet it remains challenging to obtain proteome-wide experimental information on the structural properties associated with these sequences. Fast computational predictors of secondary structure and of intrinsic disorder have been developed to bridge this gap. These two types of prediction, however, have remained largely separate, which often prevents a clear characterization of the structure and dynamics of proteins. Here, we introduce a computational method that predicts secondary-structure populations from amino acid sequences and simultaneously characterizes structure and disorder in a unified statistical mechanics framework. To develop this method, called s2D, we exploited recent advances in the analysis of NMR chemical shifts, which provide quantitative information about the probability distributions of secondary-structure elements in disordered states. Our results show that the s2D method predicts secondary-structure populations with an average error of about 14%. A validation on three datasets, of mostly disordered, mostly structured and partly structured proteins, respectively, shows that its performance is comparable to or better than that of existing predictors of intrinsic disorder and of secondary structure. These results indicate that rapid and quantitative sequence-based characterization of the structure and dynamics of proteins is possible through the prediction of the statistical distributions of their ordered and disordered regions.
