A Comparison of Three Computational Modelling Methods for the Prediction of Virological Response to Combination HIV Therapy

OBJECTIVE HIV treatment failure is commonly associated with drug resistance, and the selection of a new regimen is often guided by genotypic resistance testing. The interpretation of complex genotypic data poses a major challenge. We have developed artificial neural network (ANN) models that predict virological response to therapy from HIV genotype and other clinical information. Here we compare the accuracy of ANN with two alternative modelling methodologies: random forests (RF) and support vector machines (SVM).

METHODS Data from 1204 treatment change episodes (TCEs) were identified from the HIV Resistance Response Database Initiative (RDI) database and partitioned at random into a training set of 1154 and a test set of 50. The training set was then partitioned using an L-fold cross-validation scheme (L = 10 in this study) for training individual computational models. Seventy-six input variables were used for training the models: 55 baseline genotype mutations; the 14 potential drugs in the new treatment regimen; four treatment history variables; baseline viral load; CD4 count; and time to follow-up viral load. The output variable was follow-up viral load. Performance was evaluated in terms of the correlations and absolute differences between the individual models' predictions and the actual change in viral load (ΔVL) values.

RESULTS The correlations (r²) between predicted and actual ΔVL varied from 0.318 to 0.546 for ANN, 0.590 to 0.751 for RF and 0.300 to 0.720 for SVM. The mean absolute differences varied from 0.677 to 0.903 for ANN, 0.494 to 0.644 for RF and 0.500 to 0.790 for SVM. The individual ANN models were significantly inferior to the individual RF and SVM models. The predictions of the ANN, RF and SVM committees all correlated highly significantly with the actual ΔVL of the independent test TCEs, producing r² values of 0.689, 0.707 and 0.620, respectively. The mean absolute differences were 0.543, 0.600 and 0.607 log10 copies/ml for ANN, RF and SVM, respectively. There were no statistically significant differences between the three committees. Combining the committees' outputs improved correlations between predicted and actual virological responses: the combination of all three committees gave a correlation of r² = 0.728. The mean absolute differences followed a similar pattern.

CONCLUSIONS RF and SVM models can produce predictions of virological response to HIV treatment that are comparable in accuracy to a committee of ANN models. Combining the predictions of different models improves their accuracy somewhat. This approach has potential as a future clinical tool, and a combination of ANN and RF models is being taken forward for clinical evaluation.
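
The partitioning and evaluation steps described in the methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the committee output is assumed to be a plain average of its members' predictions (the abstract does not state how outputs were combined), and the correlation is assumed to be a squared Pearson coefficient. The toy prediction values at the end are invented purely to exercise the functions.

```python
import random
from statistics import mean


def split_train_test(n_total, n_test, seed=0):
    """Randomly partition indices 0..n_total-1 into (train, test) lists."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]


def k_fold_partitions(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


def committee_prediction(member_predictions):
    """Assumption: a committee averages its members' predictions per case."""
    return [mean(case) for case in zip(*member_predictions)]


def r_squared(predicted, actual):
    """Squared Pearson correlation between predicted and actual values."""
    mp, ma = mean(predicted), mean(actual)
    cov = sum((p - mp) * (a - ma) for p, a in zip(predicted, actual))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_a = sum((a - ma) ** 2 for a in actual)
    return cov * cov / (var_p * var_a)


def mean_absolute_difference(predicted, actual):
    """Mean of |predicted - actual| over all cases."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


# Sizes taken from the abstract: 1204 TCEs, 50 held out for testing.
train_idx, test_idx = split_train_test(1204, 50)
print(len(train_idx), len(test_idx))  # 1154 50

# Ten (train, validation) partitions of the 1154 training TCEs.
partitions = list(k_fold_partitions(train_idx, k=10))

# Invented toy values: two models' predictions for three cases.
toy_members = [[-1.0, -2.0, -0.5], [-1.4, -1.6, -0.7]]
toy_actual = [-1.1, -2.1, -0.4]
toy_committee = committee_prediction(toy_members)
print(r_squared(toy_committee, toy_actual))
print(mean_absolute_difference(toy_committee, toy_actual))
```

Under these assumptions, combining the ANN, RF and SVM committees would amount to calling `committee_prediction` once more on the three committees' outputs before computing the two metrics.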
