Coupled folding and binding with α-helix-forming molecular recognition elements

Many protein-protein and protein-nucleic acid interactions involve coupled folding and binding of at least one of the partners. Here, we propose a protein structural element or feature that mediates the binding events of initially disordered regions. This element consists of a short region, located within a longer region of disorder, that undergoes coupled binding and folding. We call these features "molecular recognition elements" (MoREs). When bound to their partners, MoREs can adopt alpha-helix, beta-strand, polyproline II helix, or irregular secondary structure conformations, or various mixtures of these four structural forms. We describe an algorithm that identifies regions with a propensity to become alpha-helix-forming molecular recognition elements (alpha-MoREs). The algorithm is based on a discriminant function that indicates such regions while giving a low false-positive error rate on a large collection of structured proteins. Application of this algorithm to databases of genomics and functionally annotated proteins indicates that alpha-MoREs are likely to play important roles in protein-protein interactions involved in signaling events.
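The abstract does not specify the discriminant function, so the following is only an illustrative sketch of the structural idea stated above: a short region embedded in a longer region of disorder. It assumes per-residue disorder scores in [0, 1] are already available from some predictor; the function name `find_candidate_dips`, the 0.5 threshold, and the length limits are hypothetical choices for illustration, not values taken from the paper.

```python
from typing import List, Sequence, Tuple


def find_candidate_dips(
    scores: Sequence[float],
    threshold: float = 0.5,   # assumed cutoff: score >= threshold means "disordered"
    max_dip: int = 20,        # assumed maximum length of the short ordered-looking region
    min_flank: int = 15,      # assumed minimum length of each flanking disordered run
) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, of short low-score runs
    flanked on both sides by long high-score (disordered) runs."""
    # Run-length encode the sequence into (is_disordered, start, end) runs.
    runs = []
    start = 0
    for i in range(1, len(scores) + 1):
        at_end = i == len(scores)
        if at_end or (scores[i] >= threshold) != (scores[start] >= threshold):
            runs.append((scores[start] >= threshold, start, i))
            start = i

    candidates = []
    # Only interior runs can have a flank on both sides.
    for k in range(1, len(runs) - 1):
        disordered, s, e = runs[k]
        if disordered or (e - s) > max_dip:
            continue
        left_len = runs[k - 1][2] - runs[k - 1][1]
        right_len = runs[k + 1][2] - runs[k + 1][1]
        if left_len >= min_flank and right_len >= min_flank:
            candidates.append((s, e))
    return candidates


if __name__ == "__main__":
    # Synthetic scores: 20 disordered residues, a 10-residue dip, 20 disordered residues.
    demo = [0.9] * 20 + [0.2] * 10 + [0.8] * 20
    print(find_candidate_dips(demo))
```

A scan like this would only nominate candidate regions. Deciding whether a candidate is likely to be an alpha-MoRE would still require a discriminant function of the kind the abstract describes, evaluated against structured proteins to control the false-positive rate.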
