Consensus sequence design as a general strategy to create hyperstable, biologically active proteins

Consensus sequence design offers a promising strategy for designing proteins of high stability while retaining biological activity, since it draws upon an evolutionary history in which residues important for both stability and function are likely to be conserved. Although there have been several reports of successful consensus design of individual targets, it is unclear from these anecdotal studies how often this approach succeeds and how often it fails. Here, we attempt to assess generality by designing consensus sequences for a set of six protein families with a range of chain lengths, structures, and activities. We characterize the resulting consensus proteins for stability, structure, and biological activity in an unbiased way. We find that all six consensus proteins adopt cooperatively folded structures in solution. Strikingly, four of the six consensus proteins show increased thermodynamic stability over naturally occurring homologues. Each consensus protein tested for function maintains at least partial biological activity. Though the peptide binding affinity of a consensus-designed SH3 is rather low, Km values for consensus enzymes are similar to values from extant homologues. Though consensus enzymes are slower than extant homologues at low temperature, they are faster than some thermophilic enzymes at high temperature. An analysis of sequence properties shows consensus proteins to be enriched in charged residues and depleted in uncharged polar residues. Sequence differences between consensus and extant homologues are predominantly located at weakly conserved surface residues, highlighting the importance of these residues in the success of the consensus strategy.

Significance Statement

A major goal of protein design is to create proteins that have high stability and biological activity. Drawing on evolutionary information encoded within extant protein sequences, consensus sequence design has produced several successes in achieving this goal. Here we explore the generality with which consensus design can be used to enhance protein stability and maintain biological activity. By designing and characterizing consensus sequences for six unrelated protein families, we find that consensus design shows high success rates in creating well-folded, hyperstable proteins that retain biological activities. Remarkably, many of these consensus proteins show higher stabilities than naturally occurring sequences of their respective protein families. Our study highlights the utility of consensus sequence design and informs the mechanisms by which it works.
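
As an illustration of the basic idea behind consensus design, the sketch below takes the most frequent residue at each column of a multiple sequence alignment. It is a minimal sketch, not the procedure used in this study: the gap threshold `max_gap_fraction`, the tie-breaking rule, the toy alignment, and the choice of D, E, K, and R as "charged" residues are all assumptions made for the example.

```python
from collections import Counter

GAP = "-"


def consensus_sequence(aligned_seqs, max_gap_fraction=0.5):
    """Return the most common residue at each alignment column.

    Columns whose gap fraction exceeds max_gap_fraction are dropped.
    This threshold is an illustrative assumption, not a value from the text.
    """
    if not aligned_seqs:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*aligned_seqs):
        gap_fraction = column.count(GAP) / len(column)
        if gap_fraction > max_gap_fraction:
            continue  # mostly gaps: treat the column as an insertion
        counts = Counter(res for res in column if res != GAP)
        if not counts:
            continue  # column contains only gaps
        # Ties are broken by first occurrence in the column (assumption).
        residue, _ = counts.most_common(1)[0]
        consensus.append(residue)
    return "".join(consensus)


def charged_fraction(seq):
    """Fraction of residues that are D, E, K, or R (assumed definition)."""
    charged = set("DEKR")
    return sum(res in charged for res in seq) / len(seq) if seq else 0.0


# Hypothetical toy alignment, for illustration only.
toy_alignment = [
    "MKE-LV",
    "MKDALV",
    "MRE-LI",
    "MKEALV",
]

consensus = consensus_sequence(toy_alignment)
print(consensus)
print(round(charged_fraction(consensus), 3))
```

In practice the input alignment would typically be filtered for redundancy and for poorly aligned sequences before the consensus is taken; those steps are omitted here.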
