SCOTCH: subtype A coreceptor tropism classification in HIV‐1

Motivation: The V3 loop of the gp120 glycoprotein of Human Immunodeficiency Virus 1 (HIV‐1) is considered to be responsible for viral coreceptor tropism. gp120 interacts with the CD4 receptor of the host cell, and V3 subsequently binds either CCR5 or CXCR4. Because the CCR5 coreceptor is targeted by entry inhibitors, reliable prediction of HIV‐1 coreceptor usage is of great interest for antiretroviral therapy. Although several methods for predicting coreceptor tropism are available, almost all of them were developed using only subtype B sequences, and several studies have shown that predictions for non‐B sequences, in particular subtype A sequences, are less reliable. Thus, the aim of the current study was to develop a reliable prediction model for subtype A viruses.

Results: Our new model, SCOTCH, is based on a stacking approach of classifier ensembles and performs significantly better on subtype A sequences than other available models. In particular, at low false positive rates (between 0.05 and 0.2, i.e. the range recommended in the German and European guidelines for tropism prediction), SCOTCH achieves significantly better partial areas under the curve and diagnostic odds ratios than existing tools, and can thus be used to reliably predict coreceptor tropism for subtype A sequences.

Availability and implementation: SCOTCH can be downloaded/accessed at http://www.heiderlab.de.
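The two evaluation criteria named in the Results, the partial area under the ROC curve over the false positive rate range 0.05 to 0.2 and the diagnostic odds ratio, can be illustrated with a minimal Python sketch. This is not the SCOTCH implementation or the authors' evaluation code. The function names are hypothetical, X4 is assumed to be the positive class (label 1), and the partial area is reported unnormalised.

```python
from typing import List, Sequence, Tuple


def diagnostic_odds_ratio(tp: int, fp: int, fn: int, tn: int) -> float:
    """DOR = (TP * TN) / (FP * FN), computed from a 2x2 confusion matrix."""
    if fp == 0 or fn == 0:
        raise ValueError("DOR is undefined when FP or FN is zero")
    return (tp * tn) / (fp * fn)


def roc_points(scores: Sequence[float],
               labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return ROC points (FPR, TPR), lowering the threshold step by step.

    Assumption: label 1 is the positive class (here X4), label 0 is R5.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores are consumed together (one ROC point).
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def partial_auc(points: Sequence[Tuple[float, float]],
                fpr_lo: float = 0.05, fpr_hi: float = 0.2) -> float:
    """Trapezoidal area under the ROC curve for fpr_lo <= FPR <= fpr_hi."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        lo, hi = max(x0, fpr_lo), min(x1, fpr_hi)
        if hi <= lo:
            continue  # segment is vertical or lies outside the FPR window
        slope = (y1 - y0) / (x1 - x0)
        y_lo = y0 + slope * (lo - x0)
        y_hi = y0 + slope * (hi - x0)
        area += 0.5 * (y_lo + y_hi) * (hi - lo)
    return area
```

With the default window, a perfect classifier reaches the maximum partial area of 0.15 (the window width), while a classifier on the chance diagonal reaches 0.01875; values from real predictors fall between such bounds and are only comparable when computed over the same window.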
