DisProt: intrinsic protein disorder annotation in 2020

Abstract The Database of Protein Disorder (DisProt, URL: https://disprot.org) provides manually curated annotations of intrinsically disordered proteins from the literature. Here we report recent developments in DisProt (version 8), including a doubling of the number of protein entries, a new disorder ontology, improvements to the annotation format and a completely new website. The website includes a redesigned graphical interface, a better search engine, a clearer API for programmatic access and a new annotation interface that integrates text-mining technologies. The new entry format provides greater flexibility, simplifies maintenance and allows more information to be captured from the literature. The new disorder ontology has been formalized and made interoperable by adopting the OWL format, and its structure and term definitions have been improved. The new annotation interface has made the curation process faster and more effective. We recently showed that new DisProt annotations can be used effectively to train and validate disorder predictors. We believe the growth of DisProt will accelerate, contributing to the improvement of function and disorder predictors and thereby helping to illuminate the ‘dark’ proteome.

Silvio C. E. Tosatto | Arne Elofsson | Marco Necci | Damiano Piovesan | Nevena Veljkovic | Patrick Ruch | Cristina Marino Buslje | Julien Gobeill | Emilie Pasche | Giovanni Minervini | Christos A. Ouzounis | Matteo Lambrughi | Elena Papaleo | Vasilis J. Promponas | Zsuzsanna Dosztányi | Bálint Mészáros | Lisanna Paladin | Valentín Iglesias | Salvador Ventura | Peter Tompa | A. Keith Dunker | Wim F. Vranken | Claudio Bassot | Tamás Horváth | Agnes Tantos | Beáta Szabó | Ivan Micetic | Norman E. Davey | Alexander Miguel Monzon | Gustavo D. Parisi | Eva Schad | Emanuela Leonardi | Federica Quaglia | Elizabeth Martínez-Pérez | Rita Pancsa | Anastasia Chasapi | Tamas Lazar | Mainak Guharoy | Andrey V. Kajava | Sandra Macedo-Ribeiro | Jordi Pujols | John Lamb | Stella Tamana | Borbála Hajdu-Soltész | András Hatos | Lucía B. Chemes | Tamás Szaniszló | Radoslav Davidovic | Nicolás Palopoli | José A. Manso | Martina Bevilacqua | Govindarajan Sudha | Burcu Aykaç Fas | Emiliano Maiani | Marco Salvatore | Pedro J. Barbosa Pereira | Mauricio Macossay-Castillo | Nikoletta Murvai | Lucía Álvarez | Guillermo I. Benítez | Nicolás S. González Foutel | Orsolya P. Kovács | Jeremy Y. Leclercq | Mátyás Pajkos
