HIV-1 CRF01_AE coreceptor usage prediction using kernel methods based logistic model trees

The determination of HIV-1 coreceptor usage plays a major role in HIV treatment. Because maraviroc is used to treat only patients who exclusively harbor R5-tropic strains, accurate classification of HIV-1 coreceptor usage can help in choosing the most advantageous treatment. In general, HIV-1 variants are classified as R5-tropic, X4-tropic, or dual/mixed-tropic based on their coreceptor usage. Coreceptor usage has been classified with various computational methods or genotypic algorithms based on V3 amino acid sequences. Most genotypic tools were designed on data sets of HIV-1 subtype B and appear to be reliable only for that subtype; their performance decreases on non-B subtypes. In this study, the support vector machine (SVM) method is used to classify HIV-1 coreceptor usage. To develop an efficient SVM classifier, we present a feature selector that uses the logistic model tree (LMT) method to select the most relevant positions from the V3 amino acid sequences. Our approach achieves up to 97.8% accuracy, 97.7% specificity, and 97.9% sensitivity, measured by ten-fold cross-validation on 273 sequences.
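The evaluation protocol named above (ten-fold cross-validation scored by accuracy, sensitivity, and specificity) can be illustrated with a minimal sketch. This is not the authors' code: the function names `k_fold_indices` and `binary_metrics` are hypothetical, the folds are contiguous rather than shuffled or stratified, and treating X4 as the positive class is an assumption made only for this example.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k contiguous (train, test) folds.

    Fold sizes differ by at most one. A real study would usually shuffle
    or stratify the samples first; that step is omitted here.
    """
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str = "X4"
) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and specificity for two-class labels.

    `positive` marks the class counted as a positive call (assumed to be
    X4 here). A ratio with an empty denominator is reported as NaN.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


if __name__ == "__main__":
    # 273 matches the number of sequences reported in the abstract.
    folds = k_fold_indices(273, k=10)
    print([len(test) for _, test in folds])
    print(binary_metrics(["X4", "X4", "R5", "R5"], ["X4", "R5", "R5", "R5"]))
```

In such a setup, a classifier would be trained on each `train` index list and scored on the matching `test` list, with the per-fold predictions pooled or averaged before the three metrics are reported.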
