SitEx: a computer system for analysis of projections of protein functional sites on eukaryotic genes

Studying the relationship between the structural and functional organization of a protein and the exon structure of its encoding gene provides insight into the function, origin and evolution of genes and proteins. The functions of proteins and their domains are defined mostly by functional sites. However, the relation of the exon–intron structure of a gene to the protein functional sites has been little studied, and resources containing data on projections of protein functional sites onto eukaryotic genes are needed. We have developed SitEx, a database that contains information on the positions of functional site amino acids in the exon structure of the encoding gene. SitEx is integrated with the BLAST and 3DExonScan programs. BLAST is used to search for sequence similarity between the query protein and the polypeptides encoded by single exons stored in SitEx. The 3DExonScan program is used to search for structural similarity between the given protein and these polypeptides by means of superimpositions. The system allows users to analyze the coding features of functional sites while taking the exon structure of the gene into account, to detect exons involved in shuffling during protein evolution, and to design protein-engineering experiments. SitEx is accessible at http://www-bionet.sscc.ru/sitex/. Currently, it contains information about 9994 functional sites found in 2021 proteins described in the proteomes of 17 organisms.
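To make the idea of projecting functional site positions onto the exon structure concrete, here is a minimal Python sketch. It is not the SitEx implementation: the function name `map_residues_to_exons`, the input format (a list of coding lengths per exon, in nucleotides) and the rule that a residue belongs to the exon holding the first nucleotide of its codon are all assumptions made for illustration.

```python
from bisect import bisect_left
from itertools import accumulate


def map_residues_to_exons(exon_cds_lengths, site_residues):
    """Map 1-based residue positions to 1-based exon numbers.

    exon_cds_lengths: coding length of each exon in nucleotides, in
        transcript order (hypothetical input format).
    site_residues: 1-based amino acid positions of a functional site.

    Assumption: a residue is assigned to the exon that contains the
    first nucleotide of its codon, even if the codon spans an intron.
    """
    # Cumulative coding-sequence coordinate at the end of each exon.
    exon_ends = list(accumulate(exon_cds_lengths))
    mapping = {}
    for residue in site_residues:
        # 1-based coding-sequence coordinate of the codon's first base.
        first_nt = 3 * (residue - 1) + 1
        if residue < 1 or first_nt > exon_ends[-1]:
            raise ValueError(f"residue {residue} lies outside the coding sequence")
        # Index of the first exon whose end coordinate reaches first_nt.
        mapping[residue] = bisect_left(exon_ends, first_nt) + 1
    return mapping


# Illustrative data only: three exons encoding a 100-residue protein.
site = map_residues_to_exons([100, 50, 150], [34, 35, 51])
exons_touched = sorted(set(site.values()))
```

In this made-up example the site residues fall in three different exons, which is the kind of observation the database is meant to support when analyzing how functional sites are encoded.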
