Probing binding hot spots at protein–RNA recognition sites

We use evolutionary conservation derived from structure alignment of polypeptide sequences, together with structural and physicochemical attributes of protein–RNA interfaces, to probe the binding hot spots at protein–RNA recognition sites. We find that the degree of conservation varies across RNA binding proteins; some evolve rapidly compared to others. Irrespective of the structural class of the complex, residues at the RNA binding sites are evolutionarily better conserved than those at the solvent-exposed surface. For recognition involving duplex RNA, residues interacting with the major groove are better conserved than those interacting with the minor groove. In complexes where more than one polypeptide is involved in RNA recognition, we identify multi-interface residues that participate simultaneously in protein–protein and protein–RNA interfaces, and show that they are better conserved than any other RNA binding residues. We also find that residues at water preservation sites are better conserved than those at hydrated or dehydrated sites. Finally, we develop a Random Forests model that uses structural and physicochemical attributes to predict binding hot spots. The model correctly predicts 80% of the instances of experimental ΔΔG values in a particular class, and provides a stepping stone towards engineering protein–RNA recognition sites with a desired affinity.
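
The comparison between binding-site and surface residues can be illustrated with a minimal sketch. It assumes that conservation is scored as the Shannon entropy of each alignment column (lower entropy means better conserved); the abstract does not state which measure is used, so this choice, the toy alignment, and the names `column_entropy` and `mean_entropy` are all hypothetical.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps.

    Returns None when the column contains only gap characters.
    """
    residues = [r for r in column if r != gap]
    if not residues:
        return None
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mean_entropy(alignment, positions):
    """Average column entropy over the given alignment positions."""
    scores = []
    for i in positions:
        h = column_entropy([seq[i] for seq in alignment])
        if h is not None:
            scores.append(h)
    return sum(scores) / len(scores)


# Toy aligned sequences (hypothetical data, for illustration only).
alignment = ["KRGSA", "KRGTL", "KRASV", "KKGTI"]
interface_positions = [0, 1, 2]  # assumed RNA-binding columns
surface_positions = [3, 4]       # assumed solvent-exposed columns

interface_score = mean_entropy(alignment, interface_positions)
surface_score = mean_entropy(alignment, surface_positions)
print(f"interface: {interface_score:.3f} bits, surface: {surface_score:.3f} bits")
```

In this toy alignment the assumed interface columns have lower mean entropy than the assumed surface columns, which is the kind of pattern the abstract describes; real analyses would use alignments of homologous sequences and structure-derived interface definitions.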
