ccPDB 2.0: an updated version of datasets created and compiled from Protein Data Bank

Abstract ccPDB 2.0 (http://webs.iiitd.edu.in/raghava/ccpdb) is an updated version of the manually curated database ccPDB, which maintains datasets required for developing methods to predict the structure and function of proteins. The number of datasets compiled from the literature has increased from 45 to 141 in ccPDB 2.0. Similarly, the number of protein structures used for creating datasets has increased from ~74 000 to ~137 000 (PDB March 2018 release). ccPDB 2.0 provides the same web services and flexible tools that were present in the previous version of the database. The updated version also incorporates links to a number of methods developed in the past few years. The resource is built on responsive templates that are compatible with mobile devices (smartphones and tablets such as the iPhone and iPad) and large-screen devices. In summary, ccPDB 2.0 is a user-friendly web-based platform that provides comprehensive and up-to-date information about datasets.
