Trade-offs between enzyme fitness and solubility illuminated by deep mutational scanning

Significance
Enzymes are used as therapeutics and in the production of specialty chemicals. Changing the amino acid sequence of an enzyme can increase solubility, but many such mutations disrupt catalytic activity. To evaluate this trade-off, we developed an experimental system to measure the relative solubility of nearly all possible single point mutants of two model enzymes. We find that the tendency for a given solubility-enhancing mutation to disrupt catalytic activity depends, among other factors, on how far the position is from the catalytic active site and whether that mutation has been sampled during evolution. We develop predictive models that identify mutations that enhance solubility without disrupting activity with an accuracy of 90%. These results have biotechnological applications.

Proteins are marginally stable, and an understanding of the sequence determinants of improved protein solubility is highly desired. For enzymes, it is well known that many mutations that increase protein solubility decrease catalytic activity. These competing effects frustrate efforts to design and engineer stable, active enzymes without laborious high-throughput activity screens. To address the trade-off between enzyme solubility and activity, we performed deep mutational scanning using two different screens/selections that purport to gauge protein solubility for two full-length enzymes. We assayed a TEM-1 beta-lactamase variant and levoglucosan kinase (LGK) using yeast surface display (YSD) screening and a twin-arginine translocation pathway selection. We then compared these scans with published experimental fitness landscapes. Results from the YSD screen could explain 37% of the variance in the fitness landscapes for one enzyme. Between 5% and 10% of all single missense mutations improve solubility, matching theoretical predictions of global protein stability.
For a given solubility-enhancing mutation, the probability that it would retain wild-type fitness was correlated with evolutionary conservation and distance to the active site, and anticorrelated with contact number. Hybrid classification models were developed that could predict solubility-enhancing mutations that maintain wild-type fitness with an accuracy of 90%. The downside of using such classification models is the removal of rare mutations that improve both fitness and solubility. To reveal the biophysical basis of enhanced protein solubility and function, we determined the crystallographic structure of one such LGK mutant. Beyond fundamental insights into trade-offs between stability and activity, these results have potential biotechnological applications.
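
The filtering idea described above can be illustrated with a minimal sketch. This is not the hybrid classification model from the study: it is a hand-written rule that combines the three features named in the abstract (whether a mutation has been sampled during evolution, distance to the active site, and contact number). The class `Mutation`, the function names, the thresholds (`min_dist`, `max_contacts`), and the toy records are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Mutation:
    name: str                   # placeholder label, not a real variant
    solubility_score: float     # assumed convention: > 0 means more soluble than wild type
    seen_in_homologs: bool      # assumed proxy for "sampled during evolution"
    dist_to_active_site: float  # angstroms (illustrative values)
    contact_number: int         # number of neighboring residues (illustrative values)


def likely_retains_fitness(m: Mutation, min_dist: float = 15.0, max_contacts: int = 16) -> bool:
    """Hypothetical rule: favor mutations seen in evolution, far from the
    active site, and at positions with few contacts."""
    return (
        m.seen_in_homologs
        and m.dist_to_active_site >= min_dist
        and m.contact_number <= max_contacts
    )


def select_candidates(mutations: list[Mutation]) -> list[str]:
    """Keep solubility-enhancing mutations that pass the fitness rule."""
    return [
        m.name
        for m in mutations
        if m.solubility_score > 0 and likely_retains_fitness(m)
    ]


# Toy records, invented for this example only.
toy = [
    Mutation("mutA", 0.8, True, 22.0, 9),     # soluble, distal, few contacts
    Mutation("mutB", 0.5, True, 6.0, 12),     # soluble but close to the active site
    Mutation("mutC", 0.9, False, 25.0, 8),    # soluble but never seen in homologs
    Mutation("mutD", -0.3, True, 30.0, 5),    # not solubility-enhancing
    Mutation("mutE", 0.4, True, 18.0, 24),    # soluble but buried (many contacts)
]

candidates = select_candidates(toy)
```

In this toy set only `mutA` passes every rule. A mutation such as `mutB` would be discarded even if it happened to improve both fitness and solubility, which mirrors the downside noted above: conservative filters remove rare doubly beneficial mutations.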
