Analysis of correlated mutations in HIV-1 protease using spectral clustering

Motivation: The ability of human immunodeficiency virus-1 (HIV-1) protease to develop mutations that confer multi-drug resistance (MDR) has been a major obstacle to designing rational therapies against HIV. Resistance is usually imparted by a cooperative mechanism that can be elucidated by a covariance analysis of sequence data. Identification of such correlated amino acid substitutions, however, may be obscured by evolutionary noise.

Results: HIV-1 protease sequences from patients subjected to different specific treatments (set 1) and from untreated patients (set 2) were subjected to sequence covariance analysis by evaluating the mutual information (MI) between all residue pairs. Spectral clustering of the resulting covariance matrices disclosed two distinctive clusters of correlated residues: the first, observed in set 1 but absent in set 2, contained residues involved in MDR acquisition (the MDR cluster); the second contained residues that differ among the various HIV-1 protease subtypes and is referred to, for brevity, as the phylogenetic cluster. The MDR cluster occupies sites close to the central symmetry axis of the enzyme, which overlap with the global hinge region identified from coarse-grained normal-mode analysis of the enzyme structure. The phylogenetic cluster, on the other hand, occupies solvent-exposed and highly mobile regions. This study demonstrates (i) that correlated substitutions resulting from neutral mutations can be distinguished from those induced by MDR through appropriate clustering analysis of sequence covariance data and (ii) a connection between global dynamics and functional substitution of amino acids.

Contact: bahar@ccbb.pitt.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
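The two computational steps named in the Results, MI between all residue pairs followed by spectral clustering of the resulting matrix, can be illustrated with a minimal sketch. The abstract does not specify the MI normalization, the clustering variant, or the number of clusters, so everything below is an assumption for illustration: plain Shannon MI in bits, and a two-way split on the sign of the second eigenvector of the symmetric normalized graph Laplacian (a normalized-cut style relaxation). The function names `mutual_information`, `mi_matrix`, and `spectral_bipartition` are hypothetical and are not taken from the paper.

```python
from collections import Counter

import numpy as np


def mutual_information(col_i, col_j):
    """Shannon MI (in bits) between two alignment columns of equal length."""
    n = len(col_i)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * np.log2(p_ab / (p_a * p_b))
    return mi


def mi_matrix(alignment):
    """Symmetric matrix of MI values for all column pairs.

    `alignment` is a list of equal-length strings (one per sequence).
    The diagonal is left at zero so it can serve as a graph weight matrix.
    """
    cols = list(zip(*alignment))
    size = len(cols)
    m = np.zeros((size, size))
    for i in range(size):
        for j in range(i + 1, size):
            m[i, j] = m[j, i] = mutual_information(cols[i], cols[j])
    return m


def spectral_bipartition(weights):
    """Split nodes in two using the symmetric normalized Laplacian.

    Assumption: a single two-way cut on the sign of the eigenvector with
    the second-smallest eigenvalue; the paper's actual procedure may differ.
    """
    w = np.asarray(weights, dtype=float)
    degree = w.sum(axis=1)
    # Guard against zero-degree nodes (e.g. fully conserved columns).
    d_inv_sqrt = 1.0 / np.sqrt(np.where(degree > 0, degree, 1.0))
    laplacian = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt
    return fiedler >= 0


# Toy alignment (made-up data): columns 0-1 covary, columns 2-3 covary.
toy_alignment = ["AACC", "AADD", "BBCC", "BBDD"]
toy_mi = mi_matrix(toy_alignment)
```

In this toy alignment the two column pairs are independent of each other, so the MI graph is disconnected and the second eigenvector is not uniquely defined; real covariance matrices carry weak background correlations (the evolutionary noise mentioned in the Motivation), which is the regime where a spectral split is meaningful.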
