Integrating linear optimization with structural modeling to increase HIV neutralization breadth

Computational protein design has been successful in modeling fixed-backbone proteins in a single conformation. However, current design methods are insufficient for modeling large ensembles of flexible proteins. Large barriers in the energy landscape are difficult to traverse while redesigning a protein sequence, and as a result current design methods sample only a fraction of the available sequence space. We propose a new computational approach that combines traditional structure-based modeling using the Rosetta software suite with machine learning and integer linear programming to overcome limitations in the Rosetta sampling methods. We demonstrate the effectiveness of this method, which we call BROAD, by benchmarking its performance in increasing the predicted breadth of anti-HIV antibodies. We use this method to increase the predicted breadth of the naturally occurring antibody VRC23 against a panel of 180 divergent HIV strains and achieve 100% predicted binding against the panel. In addition, we compare this method to state-of-the-art multistate design in Rosetta and show that it significantly outperforms the existing method. We further demonstrate that sequences designed by this method recover known binding motifs of broadly neutralizing anti-HIV antibodies. Finally, our approach is general and can be extended easily to other protein systems. Although our modeled antibodies were not tested in vitro, we predict that these variants would have greatly increased breadth compared to the wild-type antibody.
