Learning from Past Treatments and Their Outcome Improves Prediction of In Vivo Response to Anti-HIV Therapy

Infections with the human immunodeficiency virus type 1 (HIV-1) are treated with combinations of drugs. Unfortunately, HIV responds to treatment by developing resistance mutations. Consequently, the genes encoding the viral target proteins are sequenced and inspected for resistance mutations as part of routine diagnostic procedures to ensure effective treatment. To predict the response to a combination therapy, currently available computer-based methods rely on the genotype of the virus and the composition of the regimen as input. However, no available tool takes full advantage of knowledge about the order of previously prescribed regimens and the response to them. Encoding this treatment history yields a high-dimensional feature space, which makes existing methods difficult to apply in a straightforward fashion. The machine learning system proposed in this work, sequence boosting, is tailored to exploiting such high-dimensional information, i.e. to extracting longitudinal features, by drawing on recent advances in data mining and boosting. When applied to predicting the latest treatment outcome for 3,759 treatment-experienced patients from the EuResist integrated database, sequence boosting outperformed support vector machines (SVMs) with RBF kernels. Moreover, sequence boosting allows easy access to the discriminative treatment information. Analysis of the feature importance values provided by our model confirmed known facts regarding HIV treatment. For instance, the use of potent and recently licensed drugs was beneficial for patients and, conversely, patients who had received NRTI mono-therapies in the past had poor treatment prospects. Furthermore, our model revealed novel biological insights. More precisely, the combination of previously used drugs with their in vivo response is more informative than the previously used drugs alone. Using this information improves the performance of systems for predicting therapy outcome.
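To make the notion of longitudinal features more concrete, the following minimal Python sketch enumerates ordered subsequences of a treatment history in which each past regimen is paired with its observed response. It only illustrates the kind of feature space described above and is not the sequence boosting implementation: the function name `subsequence_features`, the `max_len` cutoff, the string encoding of regimens, and the example history are all assumptions made for this illustration.

```python
from itertools import combinations


def subsequence_features(history, max_len=2):
    """Enumerate ordered subsequences (gaps allowed) of up to max_len items.

    `history` is a chronologically ordered list of hashable items, here
    hypothetical (regimen, response) pairs. Each returned feature is a tuple
    of items that preserves the original treatment order.
    """
    features = set()
    for k in range(1, max_len + 1):
        # combinations() yields index tuples in increasing order, so the
        # chronological order of the treatments is preserved.
        for idx in combinations(range(len(history)), k):
            features.add(tuple(history[i] for i in idx))
    return features


# Hypothetical treatment history, oldest regimen first (illustrative only).
history = [
    ("AZT", "failure"),
    ("AZT+3TC+EFV", "success"),
    ("TDF+FTC+LPV/r", "success"),
]

# Features that combine each past regimen with its in vivo response.
with_response = subsequence_features(history)

# Features built from the regimens alone, ignoring the response.
drugs_only = subsequence_features([regimen for regimen, _ in history])

for feature in sorted(with_response, key=lambda f: (len(f), f)):
    print(feature)
```

Even in this toy setting the number of candidate features grows combinatorially with the history length and `max_len`, which is why a method that selects discriminative subsequences during learning, rather than materializing all of them up front, is attractive.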
