Supervised classification of protein structures based on convex hull representation

One of the central problems in functional genomics is establishing classification schemes for protein structures. In this paper, relationships among protein structures are explored within the framework of supervised learning. Specifically, novel patterns based on a convex hull representation are first extracted from each protein structure; a classification system is then constructed using machine learning methods such as neural networks, hidden Markov models (HMMs), and support vector machines (SVMs). The classification experiments focus on the CATH scheme. The results indicate that the proposed supervised classification scheme is effective and efficient.
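
The two-stage pipeline described above (geometric patterns from a convex hull, then a supervised classifier) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not specify the patterns, so the sketch uses three simple hull descriptors (vertex count, surface area, volume); the hull is found by brute force and assumes points in general position (no four hull points coplanar); and a nearest-centroid rule stands in for the neural network, HMM, and SVM classifiers named in the abstract. The coordinates and the labels `compact` and `elongated` are toy values, not CATH classes.

```python
from itertools import combinations
from math import dist, sqrt


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def hull_facets(points, eps=1e-9):
    """Brute-force hull: a triple of points is a facet if every other
    point lies on one side of its plane (general position assumed)."""
    facets = []
    for i, j, k in combinations(range(len(points)), 3):
        normal = cross(sub(points[j], points[i]), sub(points[k], points[i]))
        if dot(normal, normal) < eps:
            continue  # collinear triple, defines no plane
        sides = [dot(normal, sub(p, points[i]))
                 for n, p in enumerate(points) if n not in (i, j, k)]
        if all(s <= eps for s in sides) or all(s >= -eps for s in sides):
            facets.append((i, j, k))
    return facets


def hull_features(points):
    """Hypothetical descriptors: (hull vertex count, surface area, volume)."""
    facets = hull_facets(points)
    vertices = {idx for facet in facets for idx in facet}
    # The mean of all points lies inside the hull, so it can serve as the
    # apex of one tetrahedron per facet when summing the volume.
    centre = tuple(sum(p[d] for p in points) / len(points) for d in range(3))
    area = volume = 0.0
    for i, j, k in facets:
        a, b, c = points[i], points[j], points[k]
        normal = cross(sub(b, a), sub(c, a))
        area += 0.5 * sqrt(dot(normal, normal))
        volume += abs(dot(sub(a, centre),
                          cross(sub(b, centre), sub(c, centre)))) / 6.0
    return (float(len(vertices)), area, volume)


def train_nearest_centroid(features, labels):
    """Stand-in classifier: store the mean feature vector of each class."""
    sums = {}
    for f, y in zip(features, labels):
        total, count = sums.get(y, ((0.0,) * len(f), 0))
        sums[y] = (tuple(t + v for t, v in zip(total, f)), count + 1)
    return {y: tuple(t / c for t in total) for y, (total, c) in sums.items()}


def predict(model, feature):
    """Return the label whose class centroid is closest to the feature."""
    return min(model, key=lambda y: dist(model[y], feature))


# Toy "structures": a tetrahedron with one interior point, and the same
# shape stretched along x. Real input would be atomic coordinates.
TETRA = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (0.1, 0.1, 0.1)]
STRETCHED = [(5 * x, y, z) for x, y, z in TETRA]

model = train_nearest_centroid(
    [hull_features(TETRA), hull_features(STRETCHED)],
    ["compact", "elongated"],
)
```

Calling `predict(model, hull_features(TETRA))` returns the label of the nearest class centroid. The brute-force hull is quartic in the number of points, so it only suits small examples; a production pipeline would use a dedicated convex hull routine and a classifier from the families listed in the abstract.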
