Computational design of soluble analogues of integral membrane protein structures

De novo design of complex protein folds using solely computational means remains a significant challenge. Here, we use a robust deep learning pipeline to design complex folds and soluble analogues of integral membrane proteins. Unique membrane topologies, such as those of GPCRs, are not found in the soluble proteome, and we demonstrate that their structural features can be recapitulated in solution. Biophysical analyses reveal high thermal stability of the designs, and experimental structures show remarkable design accuracy. Transplantation of native structural motifs into the soluble analogues demonstrates their potential for further functionalization as novel small-molecule binders and for new approaches to drug discovery. In summary, our workflow enables the design of complex protein topologies with high experimental success rates and low sequence similarity to natural proteins, leading to a de facto expansion of the soluble fold space.

[1]  G. Gao,et al.  De novo design of protein interactions with learned surface fingerprints , 2023, Nature.

[2]  S. Ovchinnikov,et al.  Efficient and scalable de novo protein design using a relaxed sequence space , 2023, bioRxiv.

[3]  Valentin De Bortoli,et al.  SE(3) diffusion model with application to protein backbone generation , 2023, ICML.

[4]  Mohammed AlQuraishi,et al.  Generating Novel, Designable, and Diverse Protein Structures by Equivariantly Diffusing Oriented Residue Clouds , 2023, ICML.

[5]  Alexander Rives,et al.  Language models generalize beyond natural proteins , 2022, bioRxiv.

[6]  Hamed Khakzad,et al.  De novo protein design by inversion of the AlphaFold structure prediction network , 2022, bioRxiv.

[7]  Brian L. Trippe,et al.  Broadly applicable and accurate protein design by integrating structure prediction networks and diffusion generative models , 2022, bioRxiv.

[8]  D. Hilvert,et al.  Design and optimization of enzymatic activity in a de novo β‐barrel scaffold , 2022, Protein science : a publication of the Protein Society.

[9]  Raphael R. Eguchi,et al.  De Novo Design of a Highly Stable Ovoid TIM Barrel: Unlocking Pocket Shape towards Functional Design , 2022, Biodesign research.

[10]  Rianne van den Berg,et al.  Protein structure generation via folding diffusion , 2022, Nature communications.

[11]  L. Carter,et al.  Hallucinating symmetric protein assemblies , 2022, Science.

[12]  S. Ovchinnikov,et al.  Scaffolding protein functional sites using deep learning , 2022, Science.

[13]  Brian L. Trippe,et al.  Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem , 2022, ICLR.

[14]  B. Sankaran,et al.  Robust deep learning based protein sequence design using ProteinMPNN , 2022, bioRxiv.

[15]  Tudor Achim,et al.  Protein Structure and Sequence Generation with Equivariant Denoising Diffusion Probabilistic Models , 2022, ArXiv.

[16]  S. Liao,et al.  A backbone-centred energy function of neural networks for protein design , 2022, Nature.

[17]  D. Baker,et al.  De novo design of immunoglobulin-like domains , 2021, bioRxiv.

[18]  J. Korbel,et al.  AlphaDesign: A de novo protein design framework based on AlphaFold , 2021, bioRxiv.

[19]  David T. Jones,et al.  Using AlphaFold for Rapid and Accurate Fixed Backbone Protein Design , 2021, bioRxiv.

[20]  Oriol Vinyals,et al.  Highly accurate protein structure prediction with AlphaFold , 2021, Nature.

[21]  A. Myasnikov,et al.  Structural basis of human ghrelin receptor signaling by ghrelin and the synthetic agonist ibutamoren , 2021, Nature Communications.

[22]  D. Woolfson A brief history of de novo protein design: minimal, rational, and computational. , 2021, Journal of molecular biology.

[23]  Gyu Rie Lee,et al.  Accurate prediction of protein structures and interactions using a 3-track neural network , 2021, Science.

[24]  David E. Kim,et al.  Protein sequence design by conformational landscape optimization , 2021, Proceedings of the National Academy of Sciences.

[25]  Birte Höcker,et al.  Evolution, folding, and design of TIM barrels and related proteins , 2021, Current opinion in structural biology.

[26]  Radka Svobodová Vareková,et al.  CATH: increased structural coverage of functional space , 2020, Nucleic Acids Res..

[27]  D. Baker,et al.  De novo design of transmembrane β-barrels , 2020, bioRxiv.

[28]  Conrad C. Huang,et al.  UCSF ChimeraX: Structure visualization for researchers, educators, and developers , 2020, Protein science : a publication of the Protein Society.

[29]  S. Iwata,et al.  Structure of an antagonist-bound ghrelin receptor reveals possible ghrelin recognition mode , 2020, Nature Communications.

[30]  David Baker,et al.  De novo protein design by deep network hallucination , 2020, Nature.

[31]  Brian D. Weitzner,et al.  Macromolecular modeling and design in Rosetta: recent methods and frameworks , 2020, Nature Methods.

[32]  Po-Ssu Huang,et al.  Protein sequence design with a learned potential , 2020, bioRxiv.

[33]  J. Gough,et al.  The SCOP database in 2020: expanded classification of representative family and superfamily domains of known protein structures , 2019, Nucleic Acids Res..

[34]  Christopher J. Williams,et al.  Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix , 2019, Acta crystallographica. Section D, Structural biology.

[35]  F. Fraternali,et al.  Lipid Head Group Parameterization for GROMOS 54A8: A Consistent Approach with Protein Force Field Description , 2019, Journal of chemical theory and computation.

[36]  J. Kaduk Structure validation , 2019, International Tables for Crystallography.

[37]  Regina Barzilay,et al.  Generative Models for Graph-Based Protein Design , 2019, DGS@ICLR.

[38]  M. Zacharias,et al.  A single residue switch reveals principles of antibody domain integrity , 2018, The Journal of Biological Chemistry.

[39]  D. Baker,et al.  De novo design of a non-local β-sheet protein with high stability and accuracy , 2018, Nature Structural & Molecular Biology.

[40]  A. Tichá,et al.  The Rhomboid Superfamily: Structural Mechanisms and Chemical Biology Opportunities. , 2018, Trends in biochemical sciences.

[41]  William Sheffler,et al.  De novo design of a fluorescence-activating β-barrel , 2018, Nature.

[42]  M. Congreve,et al.  Structure-Based Optimization Strategies for G Protein-Coupled Receptor (GPCR) Allosteric Modulators: A Case Study from Analyses of New Metabotropic Glutamate Receptor 5 (mGlu5) X-ray Structures. , 2018, Journal of medicinal chemistry.

[43]  David E. Gloriam,et al.  Pharmacogenomics of GPCR Drug Targets , 2018, Cell.

[44]  Christopher J. Williams,et al.  MolProbity: More and better reference data for improved all‐atom structure validation , 2018, Protein science : a publication of the Protein Society.

[45]  David E. Gloriam,et al.  Trends in GPCR drug discovery: new agents, targets and indications , 2017, Nature Reviews Drug Discovery.

[46]  David Baker,et al.  Computational design of environmental sensors for the potent opioid fentanyl , 2017, eLife.

[47]  Hiroshi Suzuki,et al.  Crystal structures of claudins: insights into their intermolecular interactions , 2017, Annals of the New York Academy of Sciences.

[48]  J. Cochran,et al.  Emerging Strategies for Developing Next-Generation Protein Therapeutics for Cancer Treatment. , 2016, Trends in pharmacological sciences.

[49]  D. Baker,et al.  De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy , 2015, Nature chemical biology.

[50]  D. Nagarajan,et al.  Design of symmetric TIM barrel proteins from first principles , 2015, BMC Biochemistry.

[51]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[52]  A. Plückthun,et al.  Engineered proteins with desired specificity: DARPins, other alternative scaffolds and bispecific IgGs. , 2014, Current opinion in structural biology.

[53]  O. Nureki,et al.  Crystal Structure of a Claudin Provides Insight into the Architecture of Tight Junctions , 2014, Science.

[54]  David Baker,et al.  Computational design of ligand-binding proteins with high affinity and selectivity , 2013, Nature.

[55]  Jeffery G. Saven,et al.  A Computationally Designed Water-Soluble Variant of a G-Protein-Coupled Receptor: The Human Mu Opioid Receptor , 2013, PloS one.

[56]  Arwin J. Brouwer,et al.  Activity-based probes for rhomboid proteases discovered in a mass spectrometry-based assay , 2013, Proceedings of the National Academy of Sciences.

[57]  R. Stevens,et al.  Structure-function of the G protein-coupled receptor superfamily. , 2013, Annual review of pharmacology and toxicology.

[58]  Chris Oostenbrink,et al.  Testing of the GROMOS Force-Field Parameter Set 54A8: Structural Properties of Electrolyte Solutions, Lipid Bilayers, and Proteins , 2013, Journal of chemical theory and computation.

[59]  A. Dunker,et al.  Proline Rich Motifs as Drug Targets in Immune Mediated Disorders , 2012, International journal of peptides.

[60]  A. Sharff,et al.  Data processing and analysis with the autoPROC toolbox , 2011, Acta crystallographica. Section D, Biological crystallography.

[61]  P. Emsley,et al.  Features and development of Coot , 2010, Acta crystallographica. Section D, Biological crystallography.

[62]  R. Dror,et al.  Improved side-chain torsion potentials for the Amber ff99SB protein force field , 2010, Proteins.

[63]  Bartek Wilczynski,et al.  Biopython: freely available Python tools for computational molecular biology and bioinformatics , 2009, Bioinform..

[64]  Colin A. Smith,et al.  Backrub-like backbone simulation recapitulates natural protein conformational variability and improves mutant side-chain prediction. , 2008, Journal of molecular biology.

[65]  Eric A. Althoff,et al.  Kemp elimination catalysts by computational enzyme design , 2008, Nature.

[66]  Valérie Capra,et al.  The Highly Conserved DRY Motif of Class A G Protein-Coupled Receptors: Beyond the Ground State , 2007, Molecular Pharmacology.

[67]  M. Parrinello,et al.  Canonical sampling through velocity rescaling. , 2007, The Journal of chemical physics.

[68]  V. Hornak,et al.  Comparison of multiple Amber force fields and development of improved protein backbone parameters , 2006, Proteins.

[69]  Yongcheng Wang,et al.  Crystal structure of a rhomboid family intramembrane protease , 2006, Nature.

[70]  R. Sterner,et al.  Catalytic Versatility, Stability, and Evolution of the (βα)8‐Barrel Enzyme‐Fold , 2006 .

[71]  Holger Gohlke,et al.  The Amber biomolecular simulation programs , 2005, J. Comput. Chem..

[72]  J. Skolnick,et al.  TM-align: a protein structure alignment algorithm based on the TM-score , 2005, Nucleic acids research.

[73]  Jeffery G. Saven,et al.  Computational design of water-soluble analogues of the potassium channel KcsA , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[74]  Dennis R. Burton,et al.  Human antibody–Fc receptor interactions illuminated by crystal structures , 2004, Nature Reviews Immunology.

[75]  T. Steitz,et al.  Rational design of 'water-soluble' bacteriorhodopsin variants. , 2002, Protein engineering.

[76]  M. Freeman,et al.  Drosophila Rhomboid-1 Defines a Family of Putative Intramembrane Serine Proteases , 2001, Cell.

[77]  Berk Hess,et al.  GROMACS 3.0: a package for molecular simulation and trajectory analysis , 2001 .

[78]  T. Steitz,et al.  Conversion of phospholamban into a soluble pentameric helical bundle. , 2001, Biochemistry.

[79]  J. Gerlt,et al.  New wine from old barrels , 2000, Nature Structural Biology.

[80]  A. Poupon,et al.  The immunoglobulin fold family: sequence analysis and 3D structure comparisons. , 1999, Protein engineering.

[81]  K. Konvička,et al.  A proposed structure for transmembrane segment 7 of G protein-coupled receptors incorporating an asn-Pro/Asp-Pro motif. , 1998, Biophysical journal.

[82]  Berk Hess,et al.  LINCS: A linear constraint solver for molecular simulations , 1997, J. Comput. Chem..

[83]  Alain Carpy,et al.  MLPP: A Program for the Calculation of Molecular Lipophilicity Potential in Proteins , 1997 .

[84]  T. Darden,et al.  A smooth particle mesh Ewald method , 1995 .

[85]  R. Abagyan,et al.  Second-generation octarellins: two new de novo (beta/alpha)8 polypeptides designed for investigating the influence of beta-residue packing on the alpha/beta-barrel structure stability. , 1995, Protein engineering.

[86]  P Bork,et al.  The immunoglobulin fold. Structural classification, sequence patterns and common core. , 1994, Journal of molecular biology.

[87]  M. Hecht,et al.  De novo design of beta-sheet proteins. , 1994, Proceedings of the National Academy of Sciences of the United States of America.

[88]  Hoover,et al.  Canonical dynamics: Equilibrium phase-space distributions. , 1985, Physical review. A, General physics.

[89]  S. Nosé A unified formulation of the constant temperature molecular dynamics methods , 1984 .

[90]  M. Parrinello,et al.  Polymorphic transitions in single crystals: A new molecular dynamics method , 1981 .

[91]  Martin Steinegger,et al.  Foldseek: fast and accurate protein structure search , 2022 .

[92]  Jian Li Targeting claudins in cancer: diagnosis, prognosis and therapy. , 2021, American journal of cancer research.

[93]  Justin A. Lemkul From Proteins to Perturbed Hamiltonians: A Suite of Tutorials for the GROMACS-2018 Molecular Simulation Package [Article v1.0] , 2019, Living Journal of Computational Molecular Science.

[94]  A. F. Williams,et al.  The immunoglobulin superfamily--domains for cell surface recognition. , 1988, Annual review of immunology.

[95]  G. Richards Intermolecular forces , 1978, Nature.