Connecting the Sequence-Space of Bacterial Signaling Proteins to Phenotypes Using Coevolutionary Landscapes

Two-component signaling (TCS) is the primary means by which bacteria sense and respond to their environment. TCS involves two partner proteins that interact to carry out cellular functions while limiting interactions with non-partners (i.e., “crosstalk”). We construct a Potts model for TCS that quantitatively predicts how mutating amino acid identities affects the interaction between TCS partners and non-partners. The parameters of this model are inferred directly from protein sequence data. This approach drastically reduces the computational complexity of exploring the sequence space of TCS proteins. As a stringent test, we compare its predictions to a recent comprehensive mutational study that characterized the functionality of 20^4 (160,000) mutational variants of the PhoQ kinase in Escherichia coli. We find that our best predictions accurately reproduce the amino acid combinations found experimentally to enable functional signaling with its partner PhoP. These predictions demonstrate the evolutionary pressure to preserve the interaction between TCS partners and to prevent unwanted crosstalk. Further, we calculate the mutational change in the binding affinity between PhoQ and PhoP, providing an estimate of the amount of destabilization needed to disrupt TCS.
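
To make the idea of scoring mutations with a Potts model concrete, the following is a minimal sketch, not the authors' implementation. It assumes the generic Potts form, in which a sequence is scored by site-wise fields h and pairwise couplings J, with lower energy taken as more favorable. The function names, the sign convention, and the random toy parameters are all illustrative assumptions; in the work described above the parameters are inferred from protein sequence data, and the exact score used for the PhoQ/PhoP comparison may differ.

```python
import random

# 20 amino acids plus the alignment gap symbol (a common convention, assumed here)
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
Q = len(ALPHABET)


def random_potts(length, seed=0):
    """Return toy fields h[i][a] and couplings J[(i, j)][a][b].

    These values are random placeholders, NOT parameters inferred from data.
    """
    rng = random.Random(seed)
    h = [[rng.gauss(0.0, 1.0) for _ in range(Q)] for _ in range(length)]
    J = {}
    for i in range(length):
        for j in range(i + 1, length):
            J[(i, j)] = [[rng.gauss(0.0, 0.1) for _ in range(Q)] for _ in range(Q)]
    return h, J


def potts_energy(seq, h, J):
    """Potts energy: minus the sum of fields, minus the sum of pair couplings."""
    idx = [ALPHABET.index(a) for a in seq]
    energy = -sum(h[i][a] for i, a in enumerate(idx))
    for (i, j), coupling in J.items():
        energy -= coupling[idx[i]][idx[j]]
    return energy


def mutational_delta(wild_type, mutant, h, J):
    """Energy change on mutation; positive means less favorable under this convention."""
    return potts_energy(mutant, h, J) - potts_energy(wild_type, h, J)


if __name__ == "__main__":
    wt = "AVLT"   # hypothetical four-residue stretch, for illustration only
    mut = "AVST"  # hypothetical single-site variant
    h, J = random_potts(len(wt))
    print(mutational_delta(wt, mut, h, J))
```

Because each variant is scored by a sum over table lookups, ranking a large library of variants requires only repeated calls to `mutational_delta`, which is what makes scanning sequence space inexpensive compared with simulating each variant.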
