DomAns - pattern based method for protein domain boundaries prediction and analysis

Abstract. Determining the native folded structure of a protein is a milestone towards understanding its function and, in most cases, can be done experimentally. However, the ability to predict protein structure and related features in silico would represent a fundamental breakthrough in structural biology. Predicting domains in proteins is among the most important tasks for effective functional classification, homology-based structure prediction, and structural genomics, as it makes function prediction easier. In this paper, we present DomAnS, a protein domain prediction approach based on pattern alignment. DomAnS allows rapid screening for potential domain regions and recognizes the most promising regions where domains might exist. Combining the DomAnS algorithm with specialized databases that contain all known domains allows us to find domain regions without solving the 3D structure. Our approach has been tested on CASP7 data, and for 28 targets it gave the best overall score.
