SLiMPrints: conservation-based discovery of functional motif fingerprints in intrinsically disordered protein regions

Large portions of higher eukaryotic proteomes are intrinsically disordered, and abundant evidence suggests that these unstructured regions of proteins are rich in regulatory interaction interfaces. One major class of disordered interaction interface comprises the compact, degenerate modules known as short linear motifs (SLiMs). Because SLiMs are difficult to identify and validate experimentally, our understanding of these modules is limited, which motivates the use of computational methods to focus experimental discovery. This article evaluates the use of evolutionary conservation as a discriminatory technique for motif discovery. A statistical framework is introduced to assess the significance of relatively conserved residues, quantifying the likelihood that a residue will have a particular level of conservation given the conservation of the surrounding residues. The framework is expanded to assess the significance of groupings of conserved residues, a metric that forms the basis of SLiMPrints (short linear motif fingerprints), a de novo motif discovery tool. SLiMPrints identifies relatively overconstrained proximal groupings of residues within intrinsically disordered regions, indicative of putatively functional motifs. Finally, the human proteome is analysed to create a set of highly conserved putative motif instances, including a novel site on translation initiation factor eIF2A that may regulate translation through binding of eIF4E.
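The idea of scoring a residue's conservation relative to its neighbours, and then scoring a grouping of such residues, can be illustrated with a minimal Python sketch. This is not the SLiMPrints implementation: the flanking-window size, the normal approximation for the background, the standard-deviation floor, and the use of Fisher's method to combine per-residue p-values are all assumptions made for illustration, and the function names are hypothetical. The input is assumed to be a list of per-column conservation scores between 0 and 1 taken from an alignment.

```python
import math
from statistics import mean, pstdev


def relative_conservation_pvalues(scores, flank=5, min_sd=0.05):
    """Upper-tail p-value per residue: how surprising is its conservation
    given the conservation of the residues flanking it?

    Assumption: the flanking scores are summarised by a normal background.
    """
    pvalues = []
    for i, score in enumerate(scores):
        lo, hi = max(0, i - flank), min(len(scores), i + flank + 1)
        window = scores[lo:i] + scores[i + 1:hi]  # neighbours only
        mu = mean(window)
        sd = max(pstdev(window), min_sd)  # floor avoids division by ~0
        z = (score - mu) / sd
        pvalues.append(0.5 * math.erfc(z / math.sqrt(2)))
    return pvalues


def fisher_combine(pvalues):
    """Combine p-values for a grouping of residues with Fisher's method.

    Assumption: the p-values are treated as independent, which is only
    approximate here because neighbouring windows overlap.
    """
    k = len(pvalues)
    half = -sum(math.log(max(p, 1e-300)) for p in pvalues)  # statistic / 2
    # Chi-square survival function with 2k degrees of freedom (closed form).
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return min(1.0, math.exp(-half) * total)


if __name__ == "__main__":
    column_scores = [0.2, 0.3, 0.25, 0.2, 0.3, 0.95, 0.25, 0.3, 0.2, 0.25, 0.3]
    pvals = relative_conservation_pvalues(column_scores, flank=3)
    best = min(range(len(pvals)), key=pvals.__getitem__)
    print("most relatively conserved position:", best)
    print("combined p for positions 4-6:", fisher_combine(pvals[4:7]))
```

In this toy input only position 5 stands out from its surroundings, so it receives the smallest p-value. Scanning many candidate groupings across a proteome would additionally require a multiple-testing correction, which the sketch omits.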
