Critical assessment of protein intrinsic disorder prediction

Intrinsically disordered proteins, which defy the traditional protein structure-function paradigm, are challenging to study experimentally. Because a large part of our knowledge rests on computational predictions, their accuracy is crucial. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in predicting intrinsically disordered regions in proteins and the subset of disordered residues involved in binding other molecules. A total of 43 methods, 32 for disorder and 11 for binding regions, were evaluated on a dataset of 646 novel, manually curated proteins from DisProt. The best methods use deep learning techniques and significantly outperform widely used earlier physicochemical methods across different types of targets. Depending on the definition used, the top disorder predictor has an FMax of 0.483 (DisProt) or 0.792 (DisProt-PDB). Disordered binding regions remain hard to predict correctly: the top binding predictor attains an FMax of only 0.231, which suggests significant potential for improvement. Intriguingly, computing times among the top-performing methods vary by up to four orders of magnitude.
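The abstract reports FMax values without defining the metric. As a rough illustration, the sketch below assumes the common definition of FMax as the maximum F1 score (harmonic mean of precision and recall) obtained over all thresholds applied to per-residue prediction scores. The function name f_max and the toy scores and labels are hypothetical and are not taken from the CAID evaluation code.

```python
from typing import Sequence, Tuple


def f_max(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (best F1, threshold) over all thresholds drawn from the scores.

    scores: per-residue predicted propensities (higher = more disordered).
    labels: per-residue reference annotation (1 = disordered, 0 = ordered).
    A residue is predicted positive when its score is >= the threshold.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    best_f1, best_threshold = 0.0, 1.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
        if tp == 0:
            # Precision and recall are both zero (or undefined); F1 cannot improve.
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_f1, best_threshold = f1, threshold
    return best_f1, best_threshold


if __name__ == "__main__":
    # Hypothetical toy data: five residues, two of them annotated as disordered.
    toy_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
    toy_labels = [1, 0, 1, 0, 0]
    print(f_max(toy_scores, toy_labels))
```

Because the threshold is chosen to maximize F1, the value describes a method at its best operating point and says nothing about performance at a predictor's default cutoff.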

Silvio C. E. Tosatto | Marco Necci | Damiano Piovesan | Norman E. Davey | M. Y. Lobanov | David C. Jones | K. Paliwal | Sheng Wang | Jinbo Xu | D. Cozzetto | Yaoqi Zhou | Jianlin Cheng | A. Dunker | Lukasz Kurgan | P. Tompa | O. Galzitskaya | A. Elofsson | M. Vendruscolo | Tristan Bitard-Feildel | Z. Dosztányi | M. Guharoy | V. Promponas | M. Lambrughi | A. M. Monzon | G. Parisi | C. Marino-Buslje | Nicolás Palopoli | S. Ventura | É. Schád | Zhenling Peng | Valentín Iglesias | L. Paladin | Bálint Mészáros | Zhonghua Wu | Gang Hu | Nawar Malhis | Emiliano Maiani | Kui Wang | M. Salvatore | J. Manso | S. Iqbal | Claudio Mirabello | N. Veljkovic | I. Callebaut | A. Kajava | Ian Walsh | Tamas Lazar | Elizabeth Martínez-Pérez | András Hatos | Tianqi Wu | Tamás Szaniszló | D. Raimondi | Ronesh Sharma | E. Leonardi | Jing Yan | P. Pereira | Jordi Pujols | I. Mičetić | Mauricio Macossay-Castillo | Nikoletta Murvai | Á. Tantos | Borbála Hajdu-Soltész | Lucía Álvarez | Claudio Bassot | Guillermo I. Benítez | Martina Bevilacqua | Tamás Horváth | J. Lamb | Jeremy Y. Leclercq | Mátyás Pajkos | S. Tamana | Fanchi Meng | Gábor Erdős | S. Govindarajan | Jianzhao Gao | Jack Hanson | Thomas Litfin | L. Chemes | R. Davidovic | Chen Wang | Pietro Sormanni | Nicolás S. González-Foutel | F. Quaglia | Alok Sharma | Orsolya P. Kovács | Sandra Macedo-Ribeiro | Elena Papaleo | Md Tamjidul Hoque | Björn Wallner | Anastasia Chasapi | Gabriele Orlando | Christos A. Ouzounis | Burcu Aykac-Fas | Rita Pancsa | Beáta Szabó | Wim F. Vranken | Giovanni Minervini | Jörg Gsponer
