Incorporating the Coevolving Information of Substrates in Predicting HIV-1 Protease Cleavage Sites

Human immunodeficiency virus 1 (HIV-1) protease (PR) plays a crucial role in the maturation of the virus. Studying the substrate specificity of HIV-1 PR aims to improve our understanding of how the enzyme recognizes its various cleavage sites. To predict HIV-1 PR cleavage sites, most existing approaches rely solely on the homogeneity of substrate sequence information, combined with supervised classification techniques. Although efficient, these approaches are limited in their ability to explain their results and likely provide few insights into the mechanisms by which HIV-1 PR cleaves substrates in a site-specific manner. In this work, a coevolutionary pattern-based prediction model for HIV-1 PR cleavage sites, named EvoCleave, is proposed; it integrates the coevolving information obtained from substrate sequences with a linear SVM classifier. The experimental results showed that EvoCleave yielded promising performance in terms of ROC analysis and $f$-measure. We also prospectively assessed the biological significance of the coevolutionary patterns by applying them to three fundamental questions about HIV-1 PR cleavage sites. The analysis demonstrated that the coevolutionary patterns offer valuable insights into the substrate specificity of HIV-1 PR.
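The abstract does not say how the coevolving information is extracted, so the following is only a minimal sketch under stated assumptions, not EvoCleave's actual procedure. It assumes that coevolution between two substrate positions can be approximated by the mutual information of their residues across a set of aligned, equal-length peptides. The function names (`mutual_information`, `coevolving_pairs`), the threshold, and the toy peptides are all hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(seqs, i, j):
    """Mutual information (in bits) between the residues at positions i and j.

    Assumption: `seqs` is a list of aligned peptides of equal length.
    """
    n = len(seqs)
    count_i = Counter(s[i] for s in seqs)
    count_j = Counter(s[j] for s in seqs)
    count_ij = Counter((s[i], s[j]) for s in seqs)
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def coevolving_pairs(seqs, threshold):
    """Return position pairs whose mutual information reaches `threshold`.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    length = len(seqs[0])
    return [
        (i, j)
        for i, j in combinations(range(length), 2)
        if mutual_information(seqs, i, j) >= threshold
    ]


# Made-up toy octamers (not real cleavage data): positions 0 and 1 vary
# together, while all remaining positions are constant.
toy_peptides = ["AVAAAAAA", "AVAAAAAA", "LFAAAAAA", "LFAAAAAA"]
pairs = coevolving_pairs(toy_peptides, threshold=0.5)
```

In a pipeline of the kind the abstract describes, position pairs selected this way could be encoded as features for a linear SVM; that step is omitted here because the abstract gives no details of the encoding.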
