Machine Learning to Identify Flexibility Signatures of Class A GPCR Inhibition

We show that machine learning can pinpoint features that distinguish inactive from active states in proteins, in particular by identifying key flexibility transitions in the ligand binding site of GPCRs that are triggered by biologically active ligands. Our analysis covered the helical segments and loops of 18 inactive and 9 active class A GPCRs whose three-dimensional structures were determined in complex with ligands. The analysis itself, however, used the proteins with the ligand removed: each helix and loop segment was assigned a flexible or rigid state by graph-theoretic ProFlex rigidity analysis, and feature selection followed by k-nearest neighbor classification was then sufficient to identify four segments surrounding the ligand binding site whose flexibility or rigidity accurately predicts whether a GPCR is in an active or inactive state. GPCRs bound to inhibitors showed similar patterns of flexible and rigid regions, whereas agonist-bound GPCRs were more flexible and more diverse. This new ligand-proximal flexibility signature of GPCR activity was identified without knowledge of the ligand binding mode or of previously defined switch regions, yet it lies adjacent to the known transmission switch. Following this proof of concept, ProFlex flexibility analysis coupled with pattern recognition and activity classification may be useful for predicting whether newly designed ligands act as activators or inhibitors, based on the pattern of flexibility they induce in the protein.
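The feature selection and k-nearest neighbor step can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy data are synthetic (12 hypothetical segments, four of them made informative by construction), the function names are invented, and plain greedy forward selection with leave-one-out evaluation stands in for whatever selection procedure the study actually used. No ProFlex output is involved; each segment is simply encoded as 1 (flexible) or 0 (rigid).

```python
import random

def make_toy_data(n_inactive=18, n_active=9, n_segments=12,
                  informative=(1, 4, 6, 9), seed=0):
    """Synthetic flexibility table: one row per structure, one column per
    segment (1 = flexible, 0 = rigid). Hypothetical data, not real results."""
    rng = random.Random(seed)
    X, y = [], []
    for label, count in ((0, n_inactive), (1, n_active)):
        for _ in range(count):
            row = [rng.randint(0, 1) for _ in range(n_segments)]
            for s in informative:
                row[s] = label  # rigid if inactive, flexible if active
            X.append(row)
            y.append(label)
    return X, y

def hamming(a, b, features):
    """Number of selected segments whose flexible/rigid state differs."""
    return sum(a[f] != b[f] for f in features)

def loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of a k-nearest neighbor majority vote."""
    correct = 0
    for i in range(len(X)):
        others = [j for j in range(len(X)) if j != i]
        others.sort(key=lambda j: hamming(X[i], X[j], features))
        votes = sum(y[j] for j in others[:k])  # votes for "active"
        predicted = 1 if 2 * votes > k else 0
        correct += predicted == y[i]
    return correct / len(X)

def forward_select(X, y, n_select=4, k=3):
    """Greedy forward selection: repeatedly add the segment that gives the
    best leave-one-out accuracy together with those already chosen."""
    selected = []
    candidates = list(range(len(X[0])))
    while len(selected) < n_select:
        best_f, best_acc = None, -1.0
        for f in candidates:
            acc = loo_accuracy(X, y, selected + [f], k)
            if acc > best_acc:
                best_f, best_acc = f, acc
        selected.append(best_f)
        candidates.remove(best_f)
    return selected, loo_accuracy(X, y, selected, k)

X, y = make_toy_data()
segments, accuracy = forward_select(X, y)
print("selected segments:", segments, "LOO accuracy:", accuracy)
```

Because the toy classes are perfectly separable on the informative segments, the sketch only demonstrates the mechanics; with real rigidity assignments the selected segments and the accuracy would depend on the data and on the choice of k.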
