Data-intensive analysis of HIV mutations

Background
In this study, HIV reverse transcriptase and protease sequences were clustered using a bitmap representation to produce an unsupervised classification of HIV sequences. The classification is intended to aid understanding of the interactions between mutations and drug resistance. We analyzed 10,229 HIV genomic sequences from the protease and reverse transcriptase regions of the pol gene, with antiretroviral resistance-related mutations represented in an 82-dimensional binary vector space.

Results
A new cluster representation was proposed using an image inspired by microarray data, in which the rows represent the protein sequences from the genotype data and the columns represent the presence or absence of a mutation at each protein position. Visualization of the clusters showed that some mutations frequently occur together and are probably related to an epistatic phenomenon.

Conclusion
We describe a methodology that applies a pattern recognition algorithm to binary data to suggest clusters of mutations that can easily be discriminated by cluster viewing schemes.
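The binary representation described in the abstract can be illustrated with a minimal sketch. It is not the study's pipeline: the five mutation columns, the toy sequences, and the helper names `encode`, `cooccurrence`, and `render` are all hypothetical, and simple pairwise co-occurrence counting stands in for the unspecified pattern recognition algorithm. The study itself used 82 columns and 10,229 sequences.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data (assumption): five mutation columns instead of 82,
# and five made-up sequences listed with the mutations they carry.
MUTATIONS = ["M41L", "K103N", "M184V", "T215Y", "L90M"]
SEQUENCES = {
    "seq1": {"M41L", "T215Y"},
    "seq2": {"M41L", "T215Y", "M184V"},
    "seq3": {"K103N"},
    "seq4": {"K103N", "M184V"},
    "seq5": {"M41L", "T215Y", "L90M"},
}


def encode(seq_mutations, columns):
    """Return a 0/1 vector with one entry per mutation column."""
    return [1 if m in seq_mutations else 0 for m in columns]


def cooccurrence(matrix, columns):
    """Count, for each pair of columns, the rows in which both are 1."""
    counts = Counter()
    for row in matrix:
        present = [c for c, bit in zip(columns, row) if bit]
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts


def render(matrix):
    """Draw the bitmap as text: '#' for presence, '.' for absence."""
    return ["".join("#" if bit else "." for bit in row) for row in matrix]


# Rows are sequences, columns are mutations, as in the bitmap image.
matrix = [encode(SEQUENCES[name], MUTATIONS) for name in sorted(SEQUENCES)]
pairs = cooccurrence(matrix, MUTATIONS)

# Sorting rows places sequences with similar leading patterns next to each
# other, a crude stand-in for ordering rows by cluster membership.
for line in render(sorted(matrix, reverse=True)):
    print(line)
print(pairs.most_common(1))
```

In this toy data the most frequent pair is the one that appears together in three of the five rows, which is the kind of co-occurring pattern a row-by-column bitmap makes visible at a glance.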
