QSLiMFinder: improved short linear motif prediction using specific query protein data

Motivation: The sensitivity of de novo short linear motif (SLiM) prediction is limited by the number of patterns (the motif space) being assessed for enrichment. QSLiMFinder uses specific query protein information to restrict the motif space and thereby increase the sensitivity and specificity of predictions.

Results: QSLiMFinder was extensively benchmarked using known SLiM-containing proteins and simulated protein interaction datasets of real human proteins. Exploiting prior knowledge of a query protein likely to be involved in a SLiM-mediated interaction increased the proportion of true positives correctly returned and reduced the proportion of datasets returning a false positive prediction. The biggest improvement was seen if a short region of the query protein flanking the interaction site was known.

Availability and implementation: All the tools and data used in this study, including QSLiMFinder and the SLiMBench benchmarking software, are freely available under a GNU license as part of SLiMSuite, at: http://bioware.soton.ac.uk.

Contact: richard.edwards@unsw.edu.au

Supplementary information: Supplementary data are available at Bioinformatics online.
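The idea in the Motivation statement, that restricting the motif space to patterns present in a query protein leaves fewer candidates to test, can be illustrated with a toy sketch. This is not the QSLiMFinder implementation: the two-position motif definition, the function names, the example sequences and the Bonferroni-style correction are all assumptions chosen for illustration, and the real tool uses its own motif construction and significance statistics.

```python
def gapped_dimers(seq, max_gap=2):
    """Toy motif definition (assumption): two fixed residues separated by
    0..max_gap wildcard positions, written like 'Q.L'."""
    motifs = set()
    for i, first in enumerate(seq):
        for gap in range(max_gap + 1):
            j = i + gap + 1
            if j < len(seq):
                motifs.add(first + "." * gap + seq[j])
    return motifs


def motif_space(sequences, max_gap=2):
    """All toy motifs that occur in at least one sequence of the dataset."""
    space = set()
    for seq in sequences:
        space |= gapped_dimers(seq, max_gap)
    return space


def query_restricted_space(query, sequences, max_gap=2):
    """Keep only the motifs that also occur in the query protein."""
    return motif_space(sequences, max_gap) & gapped_dimers(query, max_gap)


def bonferroni(p_value, n_tests):
    """Simplified multiple-testing correction (assumption, for illustration only)."""
    return min(1.0, p_value * n_tests)


# Hypothetical sequences; the query is part of the dataset.
query = "PKQTLLDF"
dataset = [query, "AAQTLAA", "GGQSLGG"]

full = motif_space(dataset)
restricted = query_restricted_space(query, dataset)

print(len(full), len(restricted))
print(bonferroni(0.001, len(full)), bonferroni(0.001, len(restricted)))
```

Because the restricted space is a subset of the full space, the same raw enrichment p-value is penalised less after correction, which is the intuition behind the sensitivity gain described above.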
