LMDIPred: A web-server for prediction of linear peptide sequences binding to SH3, WW and PDZ domains

Protein-peptide interactions form an important subset of the total protein interaction network in the cell and play key roles in signaling and regulatory networks, and in major biological processes such as cellular localization, protein degradation, and immune response. In this work, we describe the LMDIPred web server, an online resource for generalized prediction of linear peptide sequences that may bind to the three most prevalent and well-studied peptide recognition modules (PRMs): SH3, WW and PDZ. We developed support vector machine (SVM)-based prediction models that achieved a maximum Matthews correlation coefficient (MCC) of 0.85 with an accuracy of 94.55% for SH3-binding peptides, an MCC of 0.90 with an accuracy of 95.82% for WW-binding peptides, and an MCC of 0.83 with an accuracy of 92.29% for PDZ-binding peptides. LMDIPred output combines predictions from these SVM models with predictions from Position-Specific Scoring Matrices (PSSMs) and from string-matching methods based on known domain-binding motif instances and regular expressions. All of these methods were evaluated by five-fold cross-validation on both balanced and unbalanced datasets, and were also validated on independent datasets. LMDIPred aims to provide a preliminary bioinformatics platform for sequence-based prediction of probable binding sites for SH3, WW or PDZ domains.
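The regular-expression component of the string-matching step can be illustrated with a minimal sketch. The patterns below are widely quoted consensus motifs for SH3, WW and PDZ ligands, chosen here as assumptions for illustration; they are not the patterns used by LMDIPred, and the names `MOTIFS` and `scan_motifs` are hypothetical.

```python
import re

# Illustrative consensus patterns from the general literature; these are
# assumptions for this sketch, not the patterns used by LMDIPred.
MOTIFS = {
    "SH3_class_I": r"[RK]..P..P",   # [RK]xxPxxP
    "SH3_class_II": r"P..P.[RK]",   # PxxPx[RK]
    "WW_group_I": r"PP.Y",          # PPxY
    "PDZ_class_I": r"[ST].[VIL]$",  # [ST]x[VIL] at the C-terminus
}


def scan_motifs(sequence):
    """Return (motif name, 0-based start, matched text) for every hit."""
    seq = sequence.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        # A zero-width lookahead lets finditer report overlapping matches.
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((name, match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    for hit in scan_motifs("AARPLPPLPAA"):
        print(hit)
```

A pattern scan of this kind only flags candidate sites. In a combined predictor such hits would be reported alongside, not instead of, the scores from the SVM and PSSM models.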
