End-to-End Full-Atom Antibody Design

Antibody design is an essential yet challenging task in domains such as therapeutics and biology. Current learning-based methods have two major shortcomings: 1) they tackle only a single subtask of the antibody design pipeline, which makes them suboptimal or resource-intensive; and 2) they omit either the framework regions or the side chains, and so cannot capture the full-atom geometry. To address these shortcomings, we propose the dynamic Multi-channel Equivariant grAph Network (dyMEAN), an end-to-end full-atom model for E(3)-equivariant antibody design given the epitope and the incomplete sequence of the antibody. Specifically, we first explore structural initialization as a knowledgeable guess of the antibody structure, and then propose a shadow paratope to bridge the epitope-antibody connections. Both 1D sequences and 3D structures are updated via an adaptive multi-channel equivariant encoder that can process protein residues of variable sizes (that is, with different numbers of atoms) when considering full atoms. Finally, the updated antibody is docked to the epitope via the alignment of the shadow paratope. Experiments on epitope-binding CDR-H3 design, complex structure prediction, and affinity optimization demonstrate the superiority of our end-to-end framework and full-atom modeling.
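The final docking step, aligning the antibody to the epitope through the shadow paratope, can be illustrated with a rigid superposition. The sketch below is a minimal illustration and not the authors' implementation: it assumes that the alignment is a least-squares rigid fit (the Kabsch algorithm) between the antibody's own paratope coordinates and the shadow paratope coordinates predicted near the epitope. The function names `kabsch` and `dock_by_alignment`, the toy coordinates, and the choice of paratope indices are all hypothetical.

```python
import numpy as np


def kabsch(P, Q):
    """Least-squares rigid fit: find R, t such that P @ R.T + t is closest to Q."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    return R, t


def dock_by_alignment(antibody, paratope_idx, shadow_paratope):
    """Move the whole antibody so its paratope lands on the shadow paratope.

    Hypothetical helper: the transform is fitted on the paratope atoms only
    and then applied to every antibody coordinate.
    """
    R, t = kabsch(antibody[paratope_idx], shadow_paratope)
    return antibody @ R.T + t


# Toy data (assumption): 20 antibody points, of which 7 form the paratope.
rng = np.random.default_rng(0)
antibody = rng.normal(size=(20, 3))
paratope_idx = np.arange(5, 12)

# Build a shadow paratope by applying a known rotation and translation.
theta = np.deg2rad(40.0)
R_true = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
t_true = np.array([10.0, -3.0, 2.5])
shadow_paratope = antibody[paratope_idx] @ R_true.T + t_true

docked = dock_by_alignment(antibody, paratope_idx, shadow_paratope)
diff = docked[paratope_idx] - shadow_paratope
rmsd = float(np.sqrt((diff ** 2).sum(axis=1).mean()))
print(f"paratope RMSD after docking: {rmsd:.2e}")
```

Fitting the transform on the paratope alone and applying it to the full antibody keeps the antibody's internal geometry unchanged, which is the sense in which such a step is a rigid docking.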
