ProDom: Automated Clustering of Homologous Domains

The ProDom database is a comprehensive set of protein domain families automatically generated from the SWISS-PROT and TrEMBL sequence databases. An associated database, ProDom-CG, has been derived as a restriction of ProDom to completely sequenced genomes. The ProDom construction method is based on iterative PSI-BLAST searches and multiple alignments are generated for each domain family. The ProDom web server provides the user with a set of tools to visualise multiple alignments, phylogenetic trees and domain architectures of proteins, as well as a BLAST-based server to analyse new sequences for homologous domains. The comprehensive nature of ProDom makes it particularly useful to help sustain the growth of InterPro.

[1]  P. Kraulis A program to produce both detailed and schematic plots of protein structures , 1991 .

[2]  J. Wootton,et al.  Analysis of compositionally biased regions in sequence databases. , 1996, Methods in enzymology.

[3]  A. Goffeau,et al.  Analysis of the chromosome sequence of the legume symbiont Sinorhizobium meliloti strain 1021 , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[4]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[5]  H. Bernstein Recent changes to RasMol, recombining the variants. , 2000, Trends in biochemical sciences.

[6]  Alex Bateman,et al.  The InterPro database, an integrated documentation resource for protein families, domains and functional sites , 2001, Nucleic Acids Res..

[7]  N. Guex,et al.  SWISS‐MODEL and the Swiss‐Pdb Viewer: An environment for comparative protein modeling , 1997, Electrophoresis.

[8]  D. Eisenberg,et al.  Detecting protein function and protein-protein interactions from genome sequences. , 1999, Science.

[9]  Jérôme Gouzy,et al.  Browsing protein families via the 'Rich Family Description' format , 1999, Bioinform..

[10]  Thure Etzold,et al.  SRS - an indexing and retrieval tool for flat file data libraries , 1993, Comput. Appl. Biosci..

[11]  J L Sussman,et al.  Protein Data Bank (PDB): database of three-dimensional structural information of biological macromolecules. , 1998, Acta crystallographica. Section D, Biological crystallography.

[12]  Amos Bairoch,et al.  The PROSITE database, its status in 2002 , 2002, Nucleic Acids Res..

[13]  W. Kabsch,et al.  How good are predictions of protein secondary structure? , 1983, FEBS letters.

[14]  Jérôme Gouzy,et al.  ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons , 2000, Nucleic Acids Res..

[15]  Jérôme Gouzy,et al.  Whole Genome Protein Domain Analysis using a New Method for Domain Clustering , 1999, Comput. Chem..

[16]  Rolf Apweiler,et al.  The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000 , 2000, Nucleic Acids Res..