Identification of cavities on protein surface using multiple computational approaches for drug binding site prediction

MOTIVATION: Protein-ligand binding sites are the active sites on the protein surface that carry out protein function. Identifying these binding sites is therefore often the first step in studying protein function and in structure-based drug design. Many computational algorithms and tools have been developed in recent decades, such as LIGSITE(cs/c), PASS, Q-SiteFinder and SURFNET. In our previous work, MetaPocket, we showed that combining the results of several methods can improve the prediction result.

RESULTS: Here, we extend our previous work by adding four more methods (Fpocket, GHECOM, ConCavity and POCASA) to further improve the prediction success rate. The new method, MetaPocket 2.0, and the individual approaches are all tested on the same two datasets used previously: 48 unbound/bound structures and 210 bound structures. The results show that the average success rate at the top 1 prediction has increased by 5% compared with the previous work. Moreover, we construct a non-redundant dataset of drug-target complexes with known structure from the DrugBank, DrugPort and PDB databases and apply MetaPocket 2.0 to this dataset to predict drug binding sites. As a result, more than 74% of the drug binding sites on protein targets are correctly identified within the top 3 predictions, which is 12% better than the best individual approach.

AVAILABILITY: The web service of MetaPocket 2.0 and all the test datasets are freely available at http://projects.biotec.tu-dresden.de/metapocket/ and http://sysbio.zju.edu.cn/metapocket.
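The idea of combining pocket predictions from several methods and scoring them at the top n can be illustrated with a minimal sketch. This is not the MetaPocket 2.0 implementation: the greedy clustering, the 8 Å merge radius, the ranking by number of supporting methods, the 4 Å hit cutoff, and the names `consensus_sites` and `is_hit` are all assumptions chosen for illustration.

```python
from math import dist


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def consensus_sites(predictions, radius=8.0):
    """Merge pocket centers from several methods into consensus sites.

    predictions: dict mapping a method name to a list of (x, y, z) centers.
    radius: hypothetical merge distance in angstroms (an assumption).
    Returns a list of (center, supporting_methods), best supported first.
    """
    clusters = []
    for method, centers in predictions.items():
        for c in centers:
            # Join the first cluster whose current centroid is close enough.
            for cl in clusters:
                if dist(c, centroid(cl["points"])) <= radius:
                    cl["points"].append(c)
                    cl["methods"].add(method)
                    break
            else:
                clusters.append({"points": [c], "methods": {method}})
    # Rank by how many distinct methods support each site (assumed scoring).
    clusters.sort(key=lambda cl: len(cl["methods"]), reverse=True)
    return [(centroid(cl["points"]), sorted(cl["methods"])) for cl in clusters]


def is_hit(sites, ligand_atoms, top_n=3, cutoff=4.0):
    """True if any of the top_n site centers lies within `cutoff` angstroms
    of any ligand atom. The cutoff value is an assumption."""
    return any(
        min(dist(center, atom) for atom in ligand_atoms) <= cutoff
        for center, _ in sites[:top_n]
    )


# Toy input: three hypothetical methods "A", "B", "C" with made-up coordinates.
predictions = {
    "A": [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)],
    "B": [(1.0, 0.0, 0.0)],
    "C": [(0.0, 1.0, 0.0), (40.0, 0.0, 0.0)],
}
sites = consensus_sites(predictions)
```

With this toy input, the site supported by all three methods is ranked first, and `is_hit(sites, ligand_atoms, top_n=1)` checks whether that top-ranked site lies near the ligand. The success rate over a dataset would then be the fraction of structures for which `is_hit` returns true.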
