Characterizing the Relationship Between HIV‐1 Genotype and Phenotype: Prediction‐Based Classification

This paper establishes a framework for understanding the complex relationships between HIV-1 genotypic markers of resistance to antiretroviral drugs and clinical measures of disease progression. We propose a new classification scheme that distinguishes among groups of viral sequences according to the predicted probabilities of how new patients will respond to antiretroviral therapy, given the available data. The approach draws on existing cluster analysis, discriminant analysis, and recursive partitioning techniques, and it requires a model relating genotypic characteristics to phenotypic response. A data set of 2,746 sequences with the corresponding indinavir 50% inhibitory concentrations is described and used for illustration.
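As a rough illustration of the idea of grouping sequences by predicted response, the following minimal sketch may help. It is not the paper's procedure: it assumes each sequence is reduced to a tuple of binary mutation indicators, that the phenotype is dichotomized into resistant or susceptible, and that a smoothed frequency table stands in for the genotype-to-phenotype model. The function names, the smoothing constant, and the probability cutoffs are all hypothetical.

```python
from collections import defaultdict


def fit_response_model(genotypes, resistant, prior=1.0):
    """Estimate P(resistant | genotype pattern) from training data.

    Assumption: a smoothed frequency table is used here as a stand-in
    for whatever genotype-to-phenotype model is actually fitted.
    """
    counts = defaultdict(lambda: [0, 0])  # pattern -> [n_resistant, n_total]
    for g, r in zip(genotypes, resistant):
        key = tuple(g)
        counts[key][0] += int(r)
        counts[key][1] += 1
    # Additive smoothing keeps estimates away from exactly 0 or 1.
    return {p: (k + prior) / (n + 2 * prior) for p, (k, n) in counts.items()}


def classify(model, genotype, cutoffs=(1 / 3, 2 / 3), default=0.5):
    """Assign a sequence to a class index by its predicted probability.

    Patterns not seen in training fall back to `default` (an assumption).
    The class index is the number of cutoffs the probability exceeds.
    """
    p = model.get(tuple(genotype), default)
    return sum(p > c for c in cutoffs)


# Toy data (made up for illustration): two mutation indicators per sequence.
train_genotypes = [(1, 0)] * 4 + [(0, 0)] * 4
train_resistant = [1, 1, 1, 1, 0, 0, 0, 0]
model = fit_response_model(train_genotypes, train_resistant)
groups = [classify(model, g) for g in [(1, 0), (0, 0), (0, 1)]]
```

In this sketch, sequences are grouped by what the model predicts for a new patient rather than by raw sequence similarity, which is the distinction the abstract draws from conventional clustering.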
