Paper information - HemeBIND: a novel method for heme binding residue prediction by combining structural and sequence information

Background: Accurate prediction of the binding residues involved in interactions between proteins and small ligands is one of the major challenges in structural bioinformatics. Heme is an essential and commonly used ligand that plays critical roles in electron transfer, catalysis, signal transduction and gene expression. Although much effort has been devoted over the last decade to developing generic algorithms for ligand binding site prediction, no algorithm has been specifically designed to complement experimental techniques for identifying heme binding residues. A computational method for recognizing these important residues is therefore urgently needed.

Results: Here we introduce HemeBIND, an efficient algorithm for predicting heme binding residues by integrating structural and sequence information. We systematically investigated the characteristics of binding interfaces using a non-redundant dataset of heme-protein complexes. Several sequence and structural attributes, such as evolutionary conservation, solvent accessibility, depth and protrusion, clearly distinguish heme binding residues from non-binding residues. These features can be used separately or in combination to build structure-based classifiers with a support vector machine (SVM). The results showed that the information contained in these features is largely complementary and that their combination achieved the best performance. To further improve performance, we developed a post-processing procedure to reduce the number of false positives. In addition, we built a sequence-based classifier from an SVM and sequence profiles as an alternative for cases where only sequence information is available. Finally, we employed a voting method to combine the outputs of the structure-based and sequence-based classifiers, which performed remarkably better than either classifier alone.

Conclusions: HemeBIND is the first specialized algorithm for predicting heme binding residues in protein structures. Extensive experiments indicated that both the structure-based and the sequence-based methods identify heme binding residues effectively, and that their complementary relationship yields a significant improvement in prediction performance. The value of the method is highlighted by the HemeBIND web server, which is freely accessible at http://mleg.cse.sc.edu/hemeBIND/.
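The abstract does not spell out how the voting step works, so the following is only a minimal sketch of one way two per-residue classifiers could be combined. The function name `combine_by_voting`, the `threshold` parameter, the tie-breaking rule (the classifier with the larger margin wins a disagreement) and the toy scores are all assumptions made for illustration; they are not taken from HemeBIND.

```python
from typing import List, Sequence


def combine_by_voting(
    structure_scores: Sequence[float],
    sequence_scores: Sequence[float],
    threshold: float = 0.0,
) -> List[int]:
    """Label each residue 1 (binding) or 0 (non-binding).

    Hypothetical rule, not the published one: each classifier votes with
    its decision score relative to `threshold`; when the two votes
    disagree, the classifier with the larger absolute margin wins.
    """
    if len(structure_scores) != len(sequence_scores):
        raise ValueError("both classifiers must score the same residues")

    labels = []
    for s_struct, s_seq in zip(structure_scores, sequence_scores):
        vote_struct = s_struct > threshold
        vote_seq = s_seq > threshold
        if vote_struct == vote_seq:
            # Both classifiers agree, so keep their shared label.
            labels.append(int(vote_struct))
        else:
            # Disagreement: trust the more confident classifier.
            struct_margin = abs(s_struct - threshold)
            seq_margin = abs(s_seq - threshold)
            stronger = s_struct if struct_margin >= seq_margin else s_seq
            labels.append(int(stronger > threshold))
    return labels


# Toy decision scores for five residues (made-up numbers).
structure_scores = [1.2, -0.4, 0.3, -1.1, 0.1]
sequence_scores = [0.8, 0.9, -0.2, -0.7, -0.6]
print(combine_by_voting(structure_scores, sequence_scores))
```

In this sketch the scores stand in for SVM decision values, where a positive value means "binding". Any real combination rule would need to be tuned and validated on held-out heme-protein complexes.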

Jianjun Hu | Rong Liu
