High-throughput structural modeling of the HIV transmission bottleneck

After three decades of research on human immunodeficiency virus (HIV), the causative agent of acquired immunodeficiency syndrome (AIDS), a vaccine has yet to be developed. Most theoretical and experimental work on HIV vaccines has focused on the relevant molecular interactions at systemic pH levels, but HIV is typically transmitted sexually, at mucosal pH levels. We previously developed a computational approach for calculating pH sensitivity that predicted optimal transmission at mucosal pH levels and was validated by experimental electrophoretic measurements and envelope protein binding assays. We have recently augmented this approach with a combination of protein dynamical modeling, parallel computation, and data compression tools that enables high-throughput calculations. The resulting fully automated pipeline predicted pH sensitivity for a recent study involving more than 250 unique HIV envelope proteins, using approximately 1 million individual electrostatic surface calculations. We provide strong evidence supporting the earlier hypothesis that the pH sensitivity of HIV envelopes can be determined computationally. Furthermore, we propose a PCA-based indexing method that allows biomolecular structures to be compared in terms of electrostatic pH sensitivity. We use the results to predict highly transmissible HIV variants, with implications for vaccine design and efficacy.
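
As a rough illustration of what a PCA-based index of pH sensitivity could look like, the sketch below projects each structure's pH profile onto the first principal component and uses that score as a single comparable number. It is not the authors' pipeline: the function name `pca_index`, the input layout (one row per envelope, one column per pH value), and the sign convention are all assumptions made for this example, and the numbers in the usage lines are synthetic.

```python
import numpy as np


def pca_index(profiles):
    """Score each row of `profiles` along the first principal component.

    `profiles` is assumed (hypothetically) to hold one electrostatic
    measurement per pH value (columns) for each structure (rows).
    Returns the per-structure scores and the fraction of variance
    explained by the first component.
    """
    data = np.asarray(profiles, dtype=float)
    # Center each pH column so the PCA describes variation between structures.
    centered = data - data.mean(axis=0)
    # Rows of `vt` are the principal axes; `s` holds the singular values.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]
    # The sign of a principal axis is arbitrary; fix it so the loading with
    # the largest magnitude is positive, making the index reproducible.
    if axis[np.argmax(np.abs(axis))] < 0:
        axis = -axis
    scores = centered @ axis
    explained = s[0] ** 2 / np.sum(s ** 2)
    return scores, explained


# Synthetic usage: three made-up profiles whose response grows with pH
# at different rates (illustrative values only, not data from the study).
ph_values = np.linspace(4.0, 9.0, 6)
slopes = np.array([0.5, 1.0, 2.0])
synthetic_profiles = slopes[:, None] * (ph_values - 4.0)[None, :]
scores, explained = pca_index(synthetic_profiles)
```

With real data the first component would rarely capture all of the variance, so the explained-variance fraction is worth inspecting before treating the score as a one-dimensional index.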
