Consensus Decision for Protein Structure Classification

The fundamental aim of protein classification is to recognize the family of a given protein and thereby determine its biological function. The most common approaches in the literature rely on sequence or structure similarity comparisons; others use evolutionary distances between proteins. To improve classification performance, this work proposes a novel method, Consensus, which combines the decisions of several sequence and structure comparison tools to classify a given structure. Consensus also uses the evolutionary information of the compared structures. The method is tested on three databases and evaluated against several criteria. The evaluation shows that Consensus outperforms each of the individual classifiers used on its own and achieves higher classification performance than ProtClass, an alignment-free method.
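The abstract does not specify how the individual decisions are combined, so the following is only a minimal sketch of one plausible scheme, a weighted majority vote over the family labels returned by each tool. The function name `consensus_vote`, the tool names, the labels, and the weights are all hypothetical and are not taken from the paper; in particular, the weights merely stand in for whatever additional evidence (such as evolutionary information) a real system might use.

```python
from collections import defaultdict


def consensus_vote(predictions, weights=None):
    """Combine per-tool family predictions by weighted majority vote.

    predictions: dict mapping a tool name to the family label it predicts,
                 or None if the tool returned no hit (hypothetical format).
    weights:     optional dict mapping a tool name to a non-negative weight;
                 tools without an entry count with weight 1.0.
    Returns the label with the highest total weight, or None if no tool
    produced a prediction. Ties are broken alphabetically by label.
    """
    scores = defaultdict(float)
    for tool, label in predictions.items():
        if label is None:
            continue  # this tool abstains
        weight = 1.0 if weights is None else weights.get(tool, 1.0)
        scores[label] += weight
    if not scores:
        return None
    # Highest score first; alphabetical label order as a deterministic tie-break.
    return min(scores, key=lambda label: (-scores[label], label))


# Illustrative call with made-up tool names and family labels.
votes = {
    "sequence_tool": "family_A",
    "structure_tool_1": "family_A",
    "structure_tool_2": "family_B",
}
unweighted = consensus_vote(votes)
weighted = consensus_vote(votes, weights={"structure_tool_2": 3.0})
```

In the unweighted call two of the three tools agree, so their label wins; in the weighted call the single heavily weighted tool overrides them, which shows how strongly the outcome of such a scheme depends on the weighting.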
