Protein-ligand binding region prediction (PLB-SAVE) based on geometric features and CUDA acceleration

Background
Protein-ligand interactions are key processes in triggering and controlling biological functions within cells. Predicting binding regions on the protein surface helps in understanding the mechanisms and principles of molecular recognition. In silico geometrical shape analysis is a primary step in analyzing the spatial characteristics of protein binding regions and facilitates applications of bioinformatics in drug discovery and design. Here, we describe novel software, PLB-SAVE, which uses parallel processing technology to extract the geometrical construct of solid angles from surface atoms. Representative clusters and corresponding anchors were identified from all surface elements and were assigned according to the ranking of their solid angles. In addition, cavity depth indicators were obtained by proportional transformation of solid angles, and cavity volumes were calculated by scanning multiple directional vectors within each selected cavity. Depth and volume characteristics were combined with various weighting coefficients to rank the predicted potential binding regions.

Results
Two test datasets from LigASite, each containing 388 bound and unbound structures, were used to predict binding regions using PLB-SAVE and two well-known prediction systems, SiteHound and MetaPocket2.0 (MPK2). PLB-SAVE outperformed the other programs, with accuracy rates of 94.3% for unbound proteins and 95.5% for bound proteins under tenfold cross-validation. Additionally, because the parallel processing architecture was designed to enhance computational efficiency, we obtained an average 160-fold speedup in computation.

Conclusions
In silico binding region prediction is considered the initial stage in structure-based drug design. To improve the efficacy of biological experiments for drug development, we developed PLB-SAVE, which uses only geometrical features of proteins and achieves good overall performance for protein-ligand binding region prediction. Based on the same approach and rationale, this method can also be applied to predict carbohydrate-antibody interactions for the further design and development of carbohydrate-based vaccines. PLB-SAVE is available at http://save.cs.ntou.edu.tw.
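
The two geometric ideas in the abstract, a solid angle measured at a surface atom and a weighted combination of depth and volume for ranking, can be illustrated with a minimal serial sketch. This is not the PLB-SAVE implementation: the Fibonacci sampling scheme, the union-of-spheres approximation of the protein interior, the max-normalization, and all names and defaults (`solid_angle_fraction`, `rank_cavities`, `probe_radius`, `w_depth`, `w_volume`) are assumptions made for illustration only.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly evenly spaced unit vectors (Fibonacci lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points

def solid_angle_fraction(center, atoms, probe_radius=6.0, n_samples=500):
    """Estimate the fraction of a sphere around `center` buried in the protein.

    `atoms` is a list of (x, y, z, radius) tuples. The protein interior is
    approximated (assumption) as the union of these atomic spheres.
    Multiply the result by 4*pi to obtain a solid angle in steradians.
    """
    cx, cy, cz = center
    inside = 0
    for ux, uy, uz in fibonacci_sphere(n_samples):
        px = cx + probe_radius * ux
        py = cy + probe_radius * uy
        pz = cz + probe_radius * uz
        for ax, ay, az, ar in atoms:
            if (px - ax) ** 2 + (py - ay) ** 2 + (pz - az) ** 2 <= ar * ar:
                inside += 1
                break  # count each sample point at most once
    return inside / n_samples

def rank_cavities(cavities, w_depth=0.5, w_volume=0.5):
    """Rank cavities by a weighted sum of max-normalized depth and volume.

    `cavities` is a list of dicts with hypothetical keys "id", "depth",
    and "volume". Returns cavity ids, best score first.
    """
    max_depth = max(c["depth"] for c in cavities) or 1.0
    max_volume = max(c["volume"] for c in cavities) or 1.0

    def score(c):
        return (w_depth * c["depth"] / max_depth
                + w_volume * c["volume"] / max_volume)

    return [c["id"] for c in sorted(cavities, key=score, reverse=True)]
```

Each surface atom's estimate is independent of every other, which is the property that makes this kind of computation a natural fit for one-thread-per-atom GPU parallelization. A larger buried fraction suggests a more concave, pocket-like environment around the atom.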
