Machine Learning for Prioritization of Thermostabilizing Mutations for G-protein Coupled Receptors

Although the three-dimensional structures of G-protein-coupled receptors (GPCRs), the largest superfamily of drug targets, have enabled structure-based drug design, no structures are available for 87% of GPCRs. This is due to the difficulty of purifying the inherently flexible GPCRs. Identifying thermostabilized mutant GPCRs through systematic alanine scanning has been a successful stabilization strategy, but it remains a daunting task for each GPCR. We developed a computational method that combines sequence-, structure- and dynamics-based molecular properties of GPCRs that recapitulate GPCR stability with four different machine learning methods to predict thermostable mutations ahead of experiments. The method was trained on thermostability data for 1231 mutants, the largest publicly available dataset. In a blind prediction of thermostable mutations of the complement factor C5a receptor, the top 50 prioritized mutants retrieved 36% of the thermostable mutants, compared with 3% in the first 50 attempts using systematic alanine scanning.

Statement of Significance

G-protein-coupled receptors (GPCRs), the largest superfamily of membrane proteins, play a vital role in cellular physiology and are targets of blockbuster drugs. Hence it is imperative to solve the three-dimensional structures of GPCRs in various conformational states with different types of ligands bound. To reduce the experimental burden of identifying thermostable GPCR mutants, we report a computational framework that uses machine learning algorithms, trained on thermostability data for 1231 mutants and on features calculated from analysis of GPCR sequences, structures and dynamics, to predict thermostable mutations ahead of experiments. This work advances the development, validation and testing of a computational framework that can be extended to other class A GPCRs and helical membrane proteins.
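The comparison quoted in the abstract (thermostable mutants retrieved in the top 50 prioritized mutants versus the first 50 alanine-scanning attempts) can be illustrated with a minimal sketch. It assumes the quoted percentage is a recall, that is, the fraction of all thermostable mutants that fall within the first k mutants tested; the abstract does not spell this out. The function name `top_k_recall`, the scores, and the labels are hypothetical toy values, not data or code from the study.

```python
from typing import Sequence


def top_k_recall(scores: Sequence[float], is_stable: Sequence[bool], k: int) -> float:
    """Fraction of all thermostable mutants found among the k highest-scoring mutants.

    Assumption: "retrieved X% of the thermostable mutants in the top k" is read as recall.
    """
    if len(scores) != len(is_stable):
        raise ValueError("scores and is_stable must have the same length")
    total_stable = sum(is_stable)
    if total_stable == 0:
        return 0.0
    # Rank mutant indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(1 for i in ranked[:k] if is_stable[i])
    return hits / total_stable


# Hypothetical toy data: six mutants, three of them thermostable.
toy_scores = [0.9, 0.1, 0.8, 0.3, 0.7, 0.2]          # made-up model scores
toy_labels = [True, False, False, True, True, False]  # made-up experimental outcome

# Model-prioritized order: test the three highest-scoring mutants first.
ml_recall = top_k_recall(toy_scores, toy_labels, k=3)

# Systematic scan baseline: test mutants in sequence order, so earlier
# positions get higher pseudo-scores.
scan_scores = [-float(i) for i in range(len(toy_labels))]
scan_recall = top_k_recall(scan_scores, toy_labels, k=3)

print(f"prioritized: {ml_recall:.2f}, sequential scan: {scan_recall:.2f}")
```

Under this reading, the same function scores both the machine-learning ranking and the sequential scan, so the two percentages are directly comparable at a fixed experimental budget k.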
