Hierarchical kernel mixture models for the prediction of AIDS disease progression using HIV structural gp120 profiles

Changes to the glycosylation profile of HIV gp120 can influence viral pathogenesis and alter AIDS disease progression. Characterizing glycosylation differences at the sequence level alone is inadequate because the placement of carbohydrates is structurally complex. However, no structural framework has been available to date for the study of HIV disease progression. In this study, we propose a novel machine-learning-based framework that predicts AIDS disease progression in three stages (RP, SP, and LTNP) from the HIV structural gp120 profile. The framework proves to be accurate and provides an important benchmark for predicting AIDS disease progression computationally. The model is trained on a novel HIV gp120 glycosylation structural profile to detect the likely stage of AIDS disease progression for target sequences from HIV+ individuals. Its performance was compared with that of seven existing machine-learning models on the newly proposed gp120-Benchmark_1 dataset in terms of error rate (MSE), accuracy (CCI), stability (STD), and complexity (TBM). The proposed framework showed better predictive performance, with 67.82% CCI, 30.21 MSE, 0.8 STD, and 2.62 TBM across the three stages of AIDS disease progression for 50 HIV+ individuals. The framework is a valuable bioinformatics tool that will be useful in the clinical assessment of viral pathogenesis.
