Elucidating Which Pairwise Mutations Affect Protein Stability: An Exhaustive Big Data Approach

The specific sequence of amino acids in a polypeptide chain dictates the three-dimensional structure, and hence the function, of a protein. Mutagenesis experiments on physical proteins involving amino acid substitutions provide insights that enable pharmaceutical companies to design medicines to combat a variety of debilitating diseases. However, such wet lab work is prohibitively time-consuming, because studying the effects of even a single mutation may require weeks of work. Computational approaches for performing exhaustive screens of the effects of single mutations have been developed, but methods for conducting a systematic, exhaustive screen of the effects of multiple mutations are not available because of the large number of mutant protein structures that would need to be analyzed. In this work we motivate and demonstrate a proof-of-concept approach for conducting in silico experiments in which we generate all possible mutant structures with 2 amino acid substitutions for three proteins with 46, 67, and 99 residues; for the largest protein we generate 1,751,211 mutants in silico. We leverage an efficient combinatorial algorithm to assess the effects of the mutations among the mutant protein structures. We also produce heat maps for several mutation metrics to help identify which pairs of amino acids in a protein have the greatest impact on protein stability, based on how those amino acid substitutions affect the protein's flexibility.
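The figure of 1,751,211 mutants for the 99-residue protein is consistent with choosing 2 of the n residue positions and replacing each with one of the 19 other standard amino acids, giving C(n, 2) × 19² structures. The Python sketch below illustrates that count under this assumption; the function names and the 20-letter alphabet are hypothetical and for illustration only, and this is not the authors' pipeline.

```python
from itertools import combinations, product
from math import comb

# Assumption: the 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_double_mutants(n_residues, alphabet_size=len(AMINO_ACIDS)):
    """Number of distinct mutants with exactly 2 substituted positions."""
    # Choose 2 positions, then one of (alphabet_size - 1) substitutes at each.
    return comb(n_residues, 2) * (alphabet_size - 1) ** 2


def enumerate_double_mutants(sequence):
    """Yield (pos_i, new_i, pos_j, new_j) for every pairwise substitution.

    Assumes `sequence` uses only the one-letter codes in AMINO_ACIDS.
    """
    for i, j in combinations(range(len(sequence)), 2):
        subs_i = [a for a in AMINO_ACIDS if a != sequence[i]]
        subs_j = [a for a in AMINO_ACIDS if a != sequence[j]]
        for a, b in product(subs_i, subs_j):
            yield (i, a, j, b)


if __name__ == "__main__":
    for n in (46, 67, 99):
        print(n, count_double_mutants(n))
    # 46 373635
    # 67 798171
    # 99 1751211
```

Because the count grows quadratically with protein length, a generator is used for enumeration so that mutants can be processed one at a time rather than held in memory.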
