Computational Prediction of Carbohydrate‐Binding Proteins and Binding Sites

Protein‐carbohydrate interactions are essential to biological systems, and carbohydrate‐binding proteins (CBPs) are important targets in the design of antiviral and anticancer drugs. Because experimental approaches are costly and difficult, many computational methods have been developed as complementary approaches to predict CBPs or carbohydrate‐binding sites. However, most of these computational methods are not publicly available. Here, we provide a comprehensive review of related studies and demonstrate two bioinformatics methods that we recently developed. The first, SPOT‐CBP, is a template‐based method that detects CBPs from protein structure by combining a structural homology search with a knowledge‐based scoring function. In addition to accurately predicting CBPs, it can yield a model of the complex structure. Furthermore, similarly accurate predictions can be made using structures from homology modeling, which significantly expands its applicability. The second, SPRINT‐CBH, is a de novo approach that predicts binding residues directly from protein sequences by using sequence information and predicted structural properties. This approach does not need structurally similar templates and thus is not limited by the current database of known protein‐carbohydrate complex structures. These two complementary methods are available at https://sparks-lab.org. © 2018 by John Wiley & Sons, Inc.
