Understanding the Origins of Loss of Protein Function by Analyzing the Effects of Thousands of Variants on Activity and Abundance

Understanding and predicting how amino acid substitutions affect proteins is key to practical uses of proteins and to our basic understanding of protein function and evolution. Amino acid changes may affect protein function in a number of ways, including direct perturbations of activity and indirect effects on protein folding and stability. We have analysed 6749 experimentally determined variant effects from multiplexed assays of abundance and activity in two proteins (NUDT15 and PTEN) to quantify these effects. We find that a third of the variants cause loss of function, and that about half of the loss-of-function variants also have low cellular abundance. We analyse the structural and mechanistic origins of loss of function and use the experimental data to find residues important for enzymatic activity. Finally, we perform computational analyses of protein stability and evolutionary conservation and show how these can be used to predict positions where variants cause loss of activity or abundance.
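
The split between variants that lose function together with abundance and those that lose function while remaining abundant can be illustrated with a minimal sketch. It assumes each variant has an activity score and an abundance score normalised so that wild type is near 1 and a null variant is near 0. The cutoffs of 0.5, the function name `classify_variant`, and the example variants and scores are all hypothetical and are not taken from the study.

```python
from collections import Counter


def classify_variant(activity, abundance, activity_cutoff=0.5, abundance_cutoff=0.5):
    """Assign a variant to one of four classes from two normalised scores.

    The cutoffs are illustrative assumptions, not values from the study.
    """
    low_activity = activity < activity_cutoff
    low_abundance = abundance < abundance_cutoff
    if low_activity and low_abundance:
        # Loss of function that may be explained by loss of stability/abundance.
        return "low activity, low abundance"
    if low_activity:
        # Abundant but inactive: candidate for a direct effect on function.
        return "low activity, normal abundance"
    if low_abundance:
        return "normal activity, low abundance"
    return "wild-type-like"


# Hypothetical variants with made-up (activity, abundance) scores.
example_scores = {
    "variant_1": (0.95, 1.02),
    "variant_2": (0.10, 0.15),
    "variant_3": (0.05, 0.90),
    "variant_4": (0.80, 0.30),
}

class_counts = Counter(
    classify_variant(activity, abundance)
    for activity, abundance in example_scores.values()
)
print(class_counts)
```

Positions where most substitutions fall into the "low activity, normal abundance" class are the kind of residues the abstract describes as important for enzymatic activity, although any real analysis would depend on how the assay scores are normalised and thresholded.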
