DisCons: a novel tool to quantify and classify evolutionary conservation of intrinsic protein disorder

Background: Analyzing the amino acid sequence of an intrinsically disordered protein (IDP) in an evolutionary context can yield novel insights into the functional role of disordered regions and sequence elements. For many IDPs, however, the primary sequence is poorly conserved, which hampers the study of function: the conservation of the disorder profile, and of the functions that depend on it, may not be visible in a traditional analysis of the protein's evolutionary history.

Results: Here we present DisCons (Disorder Conservation), a novel pipelined tool that combines the quantification of sequence conservation and disorder conservation to classify disordered residue positions. Under this scheme, the most interesting categories for functional purposes are constrained disordered residues and flexible disordered residues. Constrained disordered residues show conservation of both the sequence and the property of disorder, and are associated mainly with specific binding functionalities (e.g., short linear motifs, SLiMs). Flexible disordered residues correspond to segments where disorder as a feature, rather than the identity of the underlying sequence, is important for function (e.g., entropic chains and linkers). DisCons therefore helps elucidate the functions arising from the disordered state, both for individual proteins and for large-scale proteomics datasets.

Conclusions: DisCons is an openly accessible sequence analysis tool that identifies and highlights structurally disordered segments of proteins where conformational flexibility is conserved across homologs and is therefore potentially functional. The tool is freely available both as a web application and as stand-alone source code hosted at http://pedb.vib.be/discons.
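The two-axis classification described in the Results can be illustrated with a minimal Python sketch. It is not the DisCons implementation: the function names, the 0.5 cutoffs, the "most common residue" conservation score, and the catch-all label for positions without conserved disorder are all assumptions made for illustration, since the abstract does not specify how the scores or thresholds are computed.

```python
from typing import List

# Hypothetical cutoffs; the abstract does not state the values DisCons uses.
SEQ_CONS_CUTOFF = 0.5
DIS_CONS_CUTOFF = 0.5


def sequence_conservation(column: str) -> float:
    """Frequency of the most common residue in one alignment column.

    Gaps ('-') are ignored. This is a deliberately simple stand-in for a
    real conservation score.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    return max(residues.count(r) for r in set(residues)) / len(residues)


def disorder_conservation(column: List[bool]) -> float:
    """Fraction of homologs predicted to be disordered at one column."""
    if not column:
        return 0.0
    return sum(column) / len(column)


def classify_position(seq_cons: float, dis_cons: float) -> str:
    """Assign a position to one of the categories named in the abstract."""
    if dis_cons >= DIS_CONS_CUTOFF:
        if seq_cons >= SEQ_CONS_CUTOFF:
            # Sequence and disorder both conserved (e.g., SLiMs).
            return "constrained disorder"
        # Disorder conserved, sequence free to vary (e.g., linkers).
        return "flexible disorder"
    # Placeholder label for everything else (an assumption of this sketch).
    return "non-conserved disorder"


if __name__ == "__main__":
    residues = "PPPA-"                          # one alignment column
    disorder = [True, True, True, True, False]  # per-homolog disorder calls
    label = classify_position(
        sequence_conservation(residues), disorder_conservation(disorder)
    )
    print(label)
```

In this sketch, a position counts as constrained only when both scores clear their cutoffs, which mirrors the abstract's distinction between residues where the sequence itself matters and residues where only the disordered state is conserved.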
